Splunk Search

Rank data from web access files

ndoshi
Splunk Employee
Splunk Employee

I have web content (articles, stories) where each article is grouped in a category such as NEWS, STORY, etc. Website visitors are grouped by region. In each region, I want to be able to rank each category by the number of site visitors who read articles in a category.

I can get a count by region and category.

I can get a count by Region, VisitorID, Category.

However, I want to know how many site visitors had CAT1 as their most-read category. How many had CAT1 as their second most-read category? How many had CAT2 as their most-read category?

Here's an example:

**Region    VID      Category   # Visitors who ranked this 1st**
NY           87        STORY               10
NY           44        STORY                9
LA           98        NEWS                 4
1 Solution

eelisio2
Path Finder

Try aggregating by region, visitor id, and classification. Then, use streamstats to create the rank field.

sourcetype="mydata" geo_region=XXX | eval Classification=case(like(pagename, "home%"), "HOME",like(pagename, "%news%") AND like(pagename, "%story%"), "STORY",like(pagename, "%markets%"), "MARKETS",like(pagename, "%personalfinance%"), "PERSONALFINANCE",like(pagename, "%search%"), "SEARCH",like(pagename, "%news%"), "NEWS")|stats count as hitcount by geo_region, visid, Classification | sort geo_region,visid,-hitcount | streamstats count as rank by visid | stats count as rankcount by geo_region, Classification, rank

Breaking down the search command, here are the results from each section:

stats count as hitcount by geo_region, visid, Classification

geo_region      visid   Classification          hitcount
CA              100     HOME                    5
CA              100     NEWS                    10
CA              100     MARKETS                 7
CA              200     NEWS                    5
CA              200     HOME                    10

You want to sort in order of the number of hits most to fewest
sort geo_region,visid,-hitcount

geo_region      visid   Classification          hitcount
CA              100     NEWS                    10
CA              100     MARKETS                 7
CA              100     HOME                    5
CA              200     HOME                    10
CA              200     NEWS                    15

Use streamstats to count the number of rows for each visitor id. By starting with 1 for each visitor id, this creates a rank field.

streamstats count as rank by visid

geo_region      visid   Classification          hitcount        rank
CA              100     NEWS                    10              1
CA              100     MARKETS                 7               2
CA              100     HOME                    5               3
CA              200     NEWS                    15              1
CA              200     HOME                    10              2

Then, count the number of rows per Classification and rank

stats count as rankcount by geo_region, Classification, rank

Classification          rank            rankcount
NEWS                    1               2
MARKETS                 2               1
HOME                    2               1
HOME                    3               1

View solution in original post

eelisio2
Path Finder

Try aggregating by region, visitor id, and classification. Then, use streamstats to create the rank field.

sourcetype="mydata" geo_region=XXX | eval Classification=case(like(pagename, "home%"), "HOME",like(pagename, "%news%") AND like(pagename, "%story%"), "STORY",like(pagename, "%markets%"), "MARKETS",like(pagename, "%personalfinance%"), "PERSONALFINANCE",like(pagename, "%search%"), "SEARCH",like(pagename, "%news%"), "NEWS")|stats count as hitcount by geo_region, visid, Classification | sort geo_region,visid,-hitcount | streamstats count as rank by visid | stats count as rankcount by geo_region, Classification, rank

Breaking down the search command, here are the results from each section:

stats count as hitcount by geo_region, visid, Classification

geo_region      visid   Classification          hitcount
CA              100     HOME                    5
CA              100     NEWS                    10
CA              100     MARKETS                 7
CA              200     NEWS                    5
CA              200     HOME                    10

You want to sort in order of the number of hits most to fewest
sort geo_region,visid,-hitcount

geo_region      visid   Classification          hitcount
CA              100     NEWS                    10
CA              100     MARKETS                 7
CA              100     HOME                    5
CA              200     HOME                    10
CA              200     NEWS                    15

Use streamstats to count the number of rows for each visitor id. By starting with 1 for each visitor id, this creates a rank field.

streamstats count as rank by visid

geo_region      visid   Classification          hitcount        rank
CA              100     NEWS                    10              1
CA              100     MARKETS                 7               2
CA              100     HOME                    5               3
CA              200     NEWS                    15              1
CA              200     HOME                    10              2

Then, count the number of rows per Classification and rank

stats count as rankcount by geo_region, Classification, rank

Classification          rank            rankcount
NEWS                    1               2
MARKETS                 2               1
HOME                    2               1
HOME                    3               1
Get Updates on the Splunk Community!

Observability | Use Synthetic Monitoring for Website Metadata Verification

If you are on Splunk Observability Cloud, you may already have Synthetic Monitoringin your observability ...

More Ways To Control Your Costs With Archived Metrics | Register for Tech Talk

Tuesday, May 14, 2024  |  11AM PT / 2PM ET Register to Attend Join us for this Tech Talk and learn how to ...

.conf24 | Personalize your .conf experience with Learning Paths!

Personalize your .conf24 Experience Learning paths allow you to level up your skill sets and dive deeper ...