Hi!
I have an issue with a query and the dedup command.
| eval service=case(
(method="GET" AND match(uri, "/v1/[a-zA-Z]{2}/customers\?.*searchString=[^&]+.*")), "FindCustomer",
(method="GET" AND match(uri, "/v2/[A-Za-z]{2}/private-customers(/[a-zA-Z0-9-]+)?(?!/)")), "ReadCustomer")
| stats count by service
Unfortunately, I have noticed that my events matching "ReadCustomer" are logged twice. I get two events right after each other, a couple of seconds apart, which pollutes my results. I need to somehow deduplicate events that have the same uri and occur within 10s of each other. I was thinking of using
|dedup uri
but realized that I want to allow the same uri if there are more than 10 seconds between the events. If dedup could take a span, that would be the optimal solution for me.
Does anyone have a good idea on how to solve this? I was also thinking about | transaction, but I'm not sure if I can use it...
Hi
What do you have in the _raw data? Are those really duplicate events, or are they genuine events that should be in the logs?
If they are logged correctly and the "same" event really should appear twice, you could probably mark the "duplicates" with streamstats by adding a count, and then filter those duplicates out before your stats count line.
See https://docs.splunk.com/Documentation/Splunk/9.2.0/SearchReference/Streamstats
Something like
...
| <set your service>
| streamstats time_window=10s count as dup_count by service, <other fields to match events correctly>
| where dup_count < 2
| stats ....
r. Ismo
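To make the outline above concrete, here is a minimal sketch applied to the search from the question. It is untested against your data and rests on two assumptions that are not in the original posts: that uri alone is enough to identify a repeated request (you may need more fields, as the placeholder above suggests), and that sorting ascending by _time is acceptable, which I added so that the earlier event of each pair is the one kept. `<your base search>` is a placeholder.

```spl
<your base search>
| eval service=case(
(method="GET" AND match(uri, "/v1/[a-zA-Z]{2}/customers\?.*searchString=[^&]+.*")), "FindCustomer",
(method="GET" AND match(uri, "/v2/[A-Za-z]{2}/private-customers(/[a-zA-Z0-9-]+)?(?!/)")), "ReadCustomer")
``` assumption: oldest first, so the first occurrence of each pair gets dup_count=1 ```
| sort 0 _time
``` count events with the same service and uri seen within the last 10 seconds ```
| streamstats time_window=10s count as dup_count by service, uri
``` keep only events that had no matching event in the preceding 10 seconds ```
| where dup_count < 2
| stats count by service
```

With this approach the same uri is counted again once more than 10 seconds have passed since the previous matching event. A run of repeats that are each less than 10 seconds apart should collapse to the first event only, since every later one still sees a predecessor inside its window.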