I need to find the top 20 sites within my sourcetype and then do further analysis on other fields, such as Product.
Is there a non-destructive stats command I can use for this?
i.e.
sourcetype=site_data | stats count by "Site Name" | head 20
and then a subsequent search to find, for those twenty sites, the top product logged?
Thanks,
Jack
You could use a subsearch:
sourcetype=site_data [|search sourcetype=site_data | top 20 "Site Name" | fields "Site Name"] <put the rest of your search here>
https://docs.splunk.com/Documentation/Splunk/7.3.0/Search/Usesubsearchtocorrelateevents
Like this:
index=YouShouldAlwaysSpecifyAnIndex sourcetype=site_data
[search index=YouShouldAlwaysSpecifyAnIndex sourcetype=site_data
| top limit=20 "Site Name" | table "Site Name" | format]
| top limit=1 Product BY "Site Name"
P.S. field names with spaces are E*V*I*L!
To add on to this answer, the subsearch provided by spayneort effectively returns the top 20 "Site Name" values as 20 "OR"-separated field=value pairs.
To understand it further: Splunk runs the subsearch first, then essentially rewrites your search to be:
sourcetype=site_data "Site Name"=https://url1 OR "Site Name"=https://url2 OR "Site Name"=http://url2 OR ....
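To see the exact filter a subsearch hands back, one option is to run the subsearch by itself with format appended. This is only a sketch that reuses the search from the answer above; the shape of the output described below is my expectation, not something confirmed in this thread.

```
sourcetype=site_data
| top limit=20 "Site Name"
| fields "Site Name"
| format
```

The result should be a single row with a field named search, holding a string along the lines of ( ( "Site Name"="..." ) OR ( "Site Name"="..." ) OR ... ). That string is what gets substituted into the outer search in place of the bracketed subsearch.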
Thanks guys, this has worked as expected! Knew there must be a simple solution.