Hello, new Splunk user here struggling a bit with my first real deep dive into SPL. I am having some trouble using Splunk to replace an Excel report of failed wireless authentications, and I need a bit of help grouping the fields properly. Each failed authentication contains three fields I am interested in:
UserName
host
Calling_Station_ID
The Excel report is sorted by the number of failure events per UserName. Then for each user, it's broken out into the number of failure events per host. Then for each host, it's broken out into the number of failure events per Calling_Station_ID. So for example the report might look like this for the user with the most failed authentications in a week:
UserName: Failed Authentications:
user1                          5689
    server1:                   2450
        aa-aa-aa-aa-aa-aa      1170
        bb-bb-bb-bb-bb-bb       776
        cc-cc-cc-cc-cc-cc       504
    server2:                   1968
        dd-dd-dd-dd-dd-dd       804
        ee-ee-ee-ee-ee-ee       621
        ff-ff-ff-ff-ff-ff       404
        gg-gg-gg-gg-gg-gg       139
    server3:                   1271
        hh-hh-hh-hh-hh-hh       555
        ii-ii-ii-ii-ii-ii       333
        jj-jj-jj-jj-jj-jj       289
        kk-kk-kk-kk-kk-kk        94
I need to build these statistics for each UserName, with the users who have the most total failed authentications at the top. The query I'm using is pretty simple, and it seems to return all the events I need to build the report:
index=acs process=Failed-Attempt | stats count by UserName, host, Calling_Station_ID | sort -count
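To illustrate with the numbers from my example above, the rows for user1 currently come out flat, roughly like this, interleaved with every other user's rows in pure count order:

UserName    host       Calling_Station_ID    count
user1       server1    aa-aa-aa-aa-aa-aa     1170
user1       server2    dd-dd-dd-dd-dd-dd      804
user1       server2    ee-ee-ee-ee-ee-ee      621
...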
I've been trying many different permutations of the query above, but this is the closest I've been able to get, and there are obviously multiple issues with the output. The most obvious one to me is that a single UserName can appear multiple times instead of being grouped. Can someone help me get closer to the output I showed above?
It was more difficult than I thought it would be. Give this a try.
index=acs process=Failed-Attempt
| stats count by UserName, host, Calling_Station_ID
| where count>50
| appendpipe [| stats sum(count) as count by UserName | sort -count | eval rank=1 | accum rank ]
| appendpipe [| stats sum(count) as count by UserName, host | sort -count | eval rank1=1 | accum rank1 ]
| eventstats values(rank1) as rank1 by UserName, host
| eventstats values(rank) as rank by UserName
| fillnull value=0 rank1
| eval rank1=if(isnull(Calling_Station_ID), rank1, rank1+1)
| sort rank rank1 -count
| fields - rank*
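In case the moving parts aren't obvious: the core trick is the appendpipe subtotal pattern. Here is a minimal sketch of just that piece, using the same field names as your data (it only rolls up one level, so it is not the full report):

index=acs process=Failed-Attempt
| stats count by UserName, host
| appendpipe [ stats sum(count) as count by UserName ]
| sort UserName -count

appendpipe runs its subpipeline over the rows already in the result set and appends the output, so you end up with the detail rows plus one summary row per UserName (the summary rows have no host value, which is how you can tell them apart). The full query above just applies that idea at two levels, and the rank/rank1 fields exist only to keep each user's block together when sorting; fields - rank* drops them at the end.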
Umm, wow. Thank you for all that work - that is much more complex than I expected.
It seems to be pretty close when it finishes, but I keep getting "Job terminated unexpectedly", usually when I move the time picker beyond the last 15 minutes (and I need to run this report over 7 days). The Splunk log doesn't show any obvious signs of an error, at least none that I can see.
Do you have any thoughts as to why it might crash like this, or where to look for clues?
How many records are you processing? Do you see any errors in the Job Inspector -> search.log? Also, if you're not already doing so, run the search in Fast Mode and see if that helps.
It crashed when a partial search returned 14,677 events (last 4 hours). The same query failed in both Smart Mode and Fast Mode. The last 60 minutes returned 2,705 events and your query ran OK.
"Do you see any errors in the Job Inspector -> search.log?"
That's the thing: no, I don't see an error. The only message types in the splunk.log are INFO. But I did just notice that the value of "eventCount" is 12500, and I see that number referenced in quite a few of the Execution Cost counts. Perhaps I need to increase this value?