Splunk Search

How to group by by email and badge type , then locate the highest level badge & expiry date?

oneemailall
Engager

Cheers,

I am hoping to get some help on a splunk search to generate a badging report.

I'll explain further.

There are two types of badges students can earn, Sell & Deploy.
There are three levels of badges within each badge type. The levels are Novice, Capable and Expert.
Issued badges expire after one year. This means students must either renew their existing badge before the expiration date or the student can earn the next level higher badge prior to the expiration date. If a student renews their existing badge, the internal system marks the badge name as Renew_Novice, Renew_Capable, or Renew_Expert depending on which badge they earn. I've supplied some demo data to help illustrate what the data looks like.

I need to generate a report that lists the student's name, email address, highest level badge name and expiration date of the highest level badge. There is no need to see lower level badges or expiration dates for lower level badges. Thank you.

Each event is a student name and badge type. I onboarded the data so that the timestamp for each event ( _time) is the EarnDate of the badge

The output of the Splunk report should show the following:

Domain, First name, Last name, Email, Badge, ExpireDate
mno.com, lisa edwards, lisa.edwards@mno.com, Sell_Expert, 12/6/23
mno.com, lisa edwards, lisa.edwards@mno.com, Deploy_Capable, 8/1/24
abc.com, allen anderson, allen.anderson@abc.com, Sell_Novice, 10/3/24
def.com, andy braden, andy.braden@def.com, Deploy_Capable, 1/3/24
ghi.com, bill connors, bill.connors@ghi.com, Sell_Novice, 10/17/23
jkl.com, brandy duggan, brandy.duggan@jkl.com, Sell_Expert, 9/5/24

Demo Data below.

First nameLast nameEmailDomainBadgeEarnDateExpireDate
lisaedwardslisa.edwards@mno.commno.comSell_Novice5/22/225/22/23
lisaedwardslisa.edwards@mno.commno.comDeploy_Novice5/27/225/27/23
andybradenandy.braden@def.comdef.comDeploy_Novice11/10/2211/10/23
allenandersonallen.anderson@abc.comabc.comSell_Novice11/18/2211/18/23
andybradenandy.braden@def.comdef.comDeploy_Capable1/3/231/3/24
billconnorsbill.connors@ghi.comghi.comSell_Novice10/17/2210/17/23
brandydugganbrandy.duggan@jkl.comjkl.comSell_Novice7/6/237/6/24
lisaedwardslisa.edwards@mno.commno.comSell_Capable7/24/227/24/23
lisaedwardslisa.edwards@mno.commno.comDeploy_Capable8/20/228/20/23
brandydugganbrandy.duggan@jkl.comjkl.comSell_Capable8/10/238/10/24
brandydugganbrandy.duggan@jkl.comjkl.comSell_Expert9/5/229/5/24
allenandersonallen.anderson@abc.comabc.comRenew_Sell_Novice10/3/2310/3/24
lisaedwardslisa.edwards@mno.commno.comSell_Expert12/6/2212/6/23
lisaedwardslisa.edwards@mno.commno.comRenew_Deploy_Capable8/1/238/1/24
Labels (1)
Tags (1)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

First of all, thank you for the excellent statement of the problem with sample data and desired output.

So, part of Badge is coded as level, and part of it as type.  You want to sort the badge by level, take the highest level in each type and the latest expiration date.  Correct?  Here you go

 

| eval type = split(Badge, "_")
| eval level = mvfind(mvappend("Novice", "Capable", "Expert"), mvindex(type, -1)) + 1
| fillnull level
| eval type = mvindex(type, -2)
| eval expire_ts = strptime(ExpireDate, "%m/%d/%y")
| sort - level, expire_ts, + "Last name" "First name"
| dedup Domain, "First name", "Last name", Email, type
| table Domain, "First name", "Last name", Email, Badge, ExpireDate

 

Your sample data gives

DomainFirst nameLast nameEmailBadgeExpireDate
jkl.combrandydugganbrandy.duggan@jkl.comSell_Expert9/5/24
mno.comlisaedwardslisa.edwards@mno.comSell_Expert12/6/23
mno.comlisaedwardslisa.edwards@mno.comRenew_Deploy_Capable8/1/24
def.comandybradenandy.braden@def.comDeploy_Capable1/3/24
abc.comallenandersonallen.anderson@abc.comRenew_Sell_Novice10/3/24
ghi.combillconnorsbill.connors@ghi.comSell_Novice10/17/23

I'm not sure why your desired output doesn't use the "Renew" prefix.  If I understand it correctly, "Renew_" means that the badge has yet to be renewed.  But if you want to get rid of it, just add:

 

| eval Badge = replace(Badge, "Renew_", "")

 

Here is an emulation that you can play with and compare with real data

 

| makeresults format=csv data="First name,Last name,Email,Domain,Badge,EarnDate,ExpireDate
lisa,edwards,lisa.edwards@mno.com,mno.com,Sell_Novice,5/22/22,5/22/23
lisa,edwards,lisa.edwards@mno.com,mno.com,Deploy_Novice,5/27/22,5/27/23 
andy,braden,andy.braden@def.com,def.com,Deploy_Novice,11/10/22,11/10/23
allen,anderson,allen.anderson@abc.com,abc.com,Sell_Novice,11/18/22,11/18/23
andy,braden,andy.braden@def.com,def.com,Deploy_Capable,1/3/23,1/3/24    
bill,connors,bill.connors@ghi.com,ghi.com,Sell_Novice,10/17/22,10/17/23
brandy,duggan,brandy.duggan@jkl.com,jkl.com,Sell_Novice,7/6/23,7/6/24
lisa,edwards,lisa.edwards@mno.com,mno.com,Sell_Capable,7/24/22,7/24/23
lisa,edwards,lisa.edwards@mno.com,mno.com,Deploy_Capable,8/20/22,8/20/23
brandy,duggan,brandy.duggan@jkl.com,jkl.com,Sell_Capable,8/10/23,8/10/24
brandy,duggan,brandy.duggan@jkl.com,jkl.com,Sell_Expert,9/5/22,9/5/24
allen,anderson,allen.anderson@abc.com,abc.com,Renew_Sell_Novice,10/3/23,10/3/24
lisa,edwards,lisa.edwards@mno.com,mno.com,Sell_Expert,12/6/22,12/6/23
lisa,edwards,lisa.edwards@mno.com,mno.com,Renew_Deploy_Capable,8/1/23,8/1/24"
``` data emulation above ```

 

 

View solution in original post

Tags (2)

yuanliu
SplunkTrust
SplunkTrust

First of all, thank you for the excellent statement of the problem with sample data and desired output.

So, part of Badge is coded as level, and part of it as type.  You want to sort the badge by level, take the highest level in each type and the latest expiration date.  Correct?  Here you go

 

| eval type = split(Badge, "_")
| eval level = mvfind(mvappend("Novice", "Capable", "Expert"), mvindex(type, -1)) + 1
| fillnull level
| eval type = mvindex(type, -2)
| eval expire_ts = strptime(ExpireDate, "%m/%d/%y")
| sort - level, expire_ts, + "Last name" "First name"
| dedup Domain, "First name", "Last name", Email, type
| table Domain, "First name", "Last name", Email, Badge, ExpireDate

 

Your sample data gives

DomainFirst nameLast nameEmailBadgeExpireDate
jkl.combrandydugganbrandy.duggan@jkl.comSell_Expert9/5/24
mno.comlisaedwardslisa.edwards@mno.comSell_Expert12/6/23
mno.comlisaedwardslisa.edwards@mno.comRenew_Deploy_Capable8/1/24
def.comandybradenandy.braden@def.comDeploy_Capable1/3/24
abc.comallenandersonallen.anderson@abc.comRenew_Sell_Novice10/3/24
ghi.combillconnorsbill.connors@ghi.comSell_Novice10/17/23

I'm not sure why your desired output doesn't use the "Renew" prefix.  If I understand it correctly, "Renew_" means that the badge has yet to be renewed.  But if you want to get rid of it, just add:

 

| eval Badge = replace(Badge, "Renew_", "")

 

Here is an emulation that you can play with and compare with real data

 

| makeresults format=csv data="First name,Last name,Email,Domain,Badge,EarnDate,ExpireDate
lisa,edwards,lisa.edwards@mno.com,mno.com,Sell_Novice,5/22/22,5/22/23
lisa,edwards,lisa.edwards@mno.com,mno.com,Deploy_Novice,5/27/22,5/27/23 
andy,braden,andy.braden@def.com,def.com,Deploy_Novice,11/10/22,11/10/23
allen,anderson,allen.anderson@abc.com,abc.com,Sell_Novice,11/18/22,11/18/23
andy,braden,andy.braden@def.com,def.com,Deploy_Capable,1/3/23,1/3/24    
bill,connors,bill.connors@ghi.com,ghi.com,Sell_Novice,10/17/22,10/17/23
brandy,duggan,brandy.duggan@jkl.com,jkl.com,Sell_Novice,7/6/23,7/6/24
lisa,edwards,lisa.edwards@mno.com,mno.com,Sell_Capable,7/24/22,7/24/23
lisa,edwards,lisa.edwards@mno.com,mno.com,Deploy_Capable,8/20/22,8/20/23
brandy,duggan,brandy.duggan@jkl.com,jkl.com,Sell_Capable,8/10/23,8/10/24
brandy,duggan,brandy.duggan@jkl.com,jkl.com,Sell_Expert,9/5/22,9/5/24
allen,anderson,allen.anderson@abc.com,abc.com,Renew_Sell_Novice,10/3/23,10/3/24
lisa,edwards,lisa.edwards@mno.com,mno.com,Sell_Expert,12/6/22,12/6/23
lisa,edwards,lisa.edwards@mno.com,mno.com,Renew_Deploy_Capable,8/1/23,8/1/24"
``` data emulation above ```

 

 

Tags (2)

oneemailall
Engager

Hi yuanliu,

Thank you for answering my query.  I am still trying to figure out why your solution works. 😀 I was able to modify it as you suggested to get the report I needed. I hope I correctly gave you credit for the solution.  I hope you not only get karma points in this community but also good karma points in life. 

Cheers. 

0 Karma

inventsekar
SplunkTrust
SplunkTrust

Hi @oneemailall .. I created sample logs as a CSV file, uploaded it to Splunk, created a new field ExpireDataEpoch, sorted the sample logs using the new field ExpireDataEpoch, created the report. 

i hope, you can adjust the SPL and fine-tune your requirements. 

 

As you are a new member, let me suggest you, Karma points appreciated, thanks. 

Verifying Epoch conversion:
source="csv-groupby.txt" sourcetype="csv" 
| eval ExpireDateEpoch=strptime(ExpireDate,"%m/%d/%y")
|table Domain, Firstname, Lastname, Email, Badge, ExpireDate, ExpireDateEpoch | sort ExpireDateEpoch

ExpireDate Sorted Report:
source="csv-groupby.txt" sourcetype="csv" 
| eval ExpireDateEpoch=strptime(ExpireDate,"%m/%d/%y")
| sort ExpireDateEpoch |table Domain, Firstname, Lastname, Email, Badge, ExpireDate

strptime-badge.jpgstrptime-badge-sorted.jpg

0 Karma

oneemailall
Engager

Hi inventsekar,

I sincerely appreciate your spending time to solve my issue.  I tested your suggestion, but still ended up with duplicate entries for people with the same type of badge (Sell or Deploy).  For example, Brandy Duggan should have only one entry, the highest level of Badge type "Sell" which is Brandy Duggan, Sell_Expert, 9/5/24. 

The results should look similar to this.

Domain, First name, Last name, Email, Badge, ExpireDate
mno.com, lisa edwards, lisa.edwards@mno.com, Sell_Expert, 12/6/23 (only show highest level badge of type "Sell")
mno.com, lisa edwards, lisa.edwards@mno.com, Deploy_Capable, 8/1/24 (only show highest level badge of type "Deploy")
abc.com, allen anderson, allen.anderson@abc.com, Sell_Novice, 10/3/24 (allen anderson renewed his badge and expiry date is updated to reflect that)
def.com, andy braden, andy.braden@def.com, Deploy_Capable, 1/3/24
ghi.com, bill connors, bill.connors@ghi.com, Sell_Novice, 10/17/23
jkl.com, brandy duggan, brandy.duggan@jkl.com, Sell_Expert, 9/5/24

 

Thank you again for helping me.  

Cheers.

 

 

0 Karma

inventsekar
SplunkTrust
SplunkTrust

@oneemailall wrote:

 I am still trying to figure out why your solution works.


 

Hi @oneemailall ..
Please note that, on my reply i said that you will need to fine-tune this further.  and nice to know the other reply works perfectly as you are expecting. 

As you were saying, you are trying to figure out why that solution works, let me try to explain,..

| eval type = split(Badge, "_")
``` Splitting the "Badge" field by the underscore, you get the "type" of the badge```
| eval level = mvfind(mvappend("Novice", "Capable", "Expert"), mvindex(type, -1)) + 1
``` the mvappend, mvindex are multivalue commands, understanding them takes a looonger time. pls check the docs https://docs.splunk.com/Documentation/SCS/current/SearchReference/MultivalueEvalFunctions ```
| fillnull level
| eval type = mvindex(type, -2)
| eval expire_ts = strptime(ExpireDate, "%m/%d/%y")
``` to sort the ExpireDate, first you need to convert to epoch timeformat```
| sort - level, expire_ts, + "Last name" "First name"
| dedup Domain, "First name", "Last name", Email, type
```sorting and dedup done nicely, you can table the output by below command```
| table Domain, "First name", "Last name", Email, Badge, ExpireDate
0 Karma

oneemailall
Engager

Hi inventsekar,

Thank you for explaining the parts of your search. The suggestions you gave helped to create a report that gives a list of unique badges (up to 3 entries for Sell and up to 3 entries for Deploy) per user, showing the badge with the latest expiration date.

The challenge now is to filter the results so that each person only shows the highest level badge earned (per badge type). So, each person will have a maximum of two entries in the final report if they have earned a Sell type badge and a Deploy type badge.

To answer your previous question, users who pass the renewal badge will have the badge "Renew_" prefix in the badge field. Your suggestion to remove the prefix worked. Thank you for the suggestion. When a renew badge is earned, it will extend the original badge expiration date further out in time.

Using the demo data I provided, the final report should look like below.

 

DomainFirst nameLast nameEmailBadgeExpireDate
mno.comlisaedwardslisa.edwards@mno.comSell_Expert12/6/23
mno.comlisaedwardslisa.edwards@mno.comDeploy_Capable8/1/24
abc.comallenandersonallen.anderson@abc.comSell_Novice10/3/24
def.comandybradenandy.braden@def.comDeploy_Capable1/3/24
ghi.combillconnorsbill.connors@ghi.comSell_Novice10/17/23
jkl.combrandydugganbrandy.duggan@jkl.comSell_Expert9/5/24

 

Assuming that the data set has already been filtered to produce a list of badges with the latest expiration date, what type of search can look at the badge type (per person), and filter to the highest level badge (per badge type) and return only that data?

For example, let's take "Person A". For the Sell type badge, the highest level is Sell_Expert. If "Person A" has earned "Sell_Expert", then only "Sell_Expert" (with the latest expiration date), will show up in the report for the Sell type badge.

If "Person A" has not earned Sell_Expert yet, then the highest level for "Person A" is Sell_Capable. 

If "Person A" has not earned Sell_Capable yet, then the highest level for "Person A" is Sell_Novice.

If "Person A" has not earned any Sell type badge yet, they will not have a Sell badge type listed in the final report.

It will be the same logic for Deploy type badge.

Thank you.

 

0 Karma
Get Updates on the Splunk Community!

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer Certification at ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...