Splunk Search

Compute row differences

guilhem
Contributor

Hello everyone!

I have a very simple result table that looks like this:

_time s duration
2/10/13 12:20:22.000 PM 21279054471791556300 0
2/10/13 12:20:43.000 PM 21279054471791556300 21
2/10/13 12:21:07.000 PM 21279054471791556300 45
2/10/13 12:21:07.000 PM 21279054471791556300 45
2/10/13 12:21:52.000 PM 21279054471791556300 90
2/10/13 12:22:26.000 PM 21279054471791556300 124
2/10/13 12:22:59.000 PM 21279054471791556300 157
2/10/13 12:23:07.000 PM 21279054471791556300 165
2/11/13 5:09:16.000 AM 21242230731515268458 131
2/11/13 5:09:38.000 AM 21242230731515268458 153
2/11/13 5:09:46.000 AM 21242230731515268458 161
2/11/13 5:09:53.000 AM 21242230731515268458 168
2/11/13 5:09:53.000 AM 21242230731515268458 168
2/11/13 5:10:19.000 AM 21242230731515268458 194
2/11/13 5:10:24.000 AM 21242230731515268458 199
2/11/13 5:10:37.000 AM 21242230731515268458 212
2/11/13 5:10:45.000 AM 21242230731515268458 220
2/11/13 5:10:50.000 AM 21242230731515268458 225

And I would like to compute a new field, let say diff, that will contains the difference between each duration, row by row: like this:

_time s duration diff
2/10/13 12:20:22.000 PM 21279054471791556300 0 / Nothing
2/10/13 12:20:43.000 PM 21279054471791556300 21 21
2/10/13 12:21:07.000 PM 21279054471791556300 45 23
2/10/13 12:21:07.000 PM 21279054471791556300 45 0
2/10/13 12:21:52.000 PM 21279054471791556300 90 45
2/10/13 12:22:26.000 PM 21279054471791556300 124 69
2/10/13 12:22:59.000 PM 21279054471791556300 157 33
2/10/13 12:23:07.000 PM 21279054471791556300 165 8
2/11/13 5:09:16.000 AM 21242230731515268458 131 / Nothing
2/11/13 5:09:38.000 AM 21242230731515268458 153 22
2/11/13 5:09:46.000 AM 21242230731515268458 161 18
2/11/13 5:09:53.000 AM 21242230731515268458 168 7
2/11/13 5:09:53.000 AM 21242230731515268458 168 0
2/11/13 5:10:19.000 AM 21242230731515268458 194 26
2/11/13 5:10:24.000 AM 21242230731515268458 199 5
2/11/13 5:10:37.000 AM 21242230731515268458 212 13
2/11/13 5:10:45.000 AM 21242230731515268458 220 8
2/11/13 5:10:50.000 AM 21242230731515268458 225 5

I have tried to use the delta command, but it doesn't work because events from several s can occur at the same time.

I want to avoid using double makemv / mvexpand and compute the difference between all and filter after, as it will cost a lot of time (n square) and I really only need the duration(n) - duration(n-1) value

Thanks a lot!

Guilhem

1 Solution

guilhem
Contributor

OK, it was quite "simple", thanks for pointing out the streamstat command, I'm not familiar with it. Anyway here the solution:

| streamstats list(duration) as duration2 c(duration) as count by s
| eval diff = mvindex(duration2, count-2)
| eval diff = duration-diff

Note that I have to count the number of values of the duration2, so I can get the before last mvindex


EDIT

The updated (much simplier) version is:

| streamstats current=f last(duration) as last_duration by s
| eval diff = duration - last_duration

Thanks to martin_mueller

View solution in original post

guilhem
Contributor

OK, it was quite "simple", thanks for pointing out the streamstat command, I'm not familiar with it. Anyway here the solution:

| streamstats list(duration) as duration2 c(duration) as count by s
| eval diff = mvindex(duration2, count-2)
| eval diff = duration-diff

Note that I have to count the number of values of the duration2, so I can get the before last mvindex


EDIT

The updated (much simplier) version is:

| streamstats current=f last(duration) as last_duration by s
| eval diff = duration - last_duration

Thanks to martin_mueller

martin_mueller
SplunkTrust
SplunkTrust

list(duration) sounds quite cumbersome when you're basically just looking for last(duration)...

guilhem
Contributor

Be carefull thow, as I have just discovered, the lis() function only returns the first 100 results, so it may be convenient to use something else (penultimate duration value, instead of list).

I will update the answer when I found out how to find the penultimate value of a field using stats.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

guilhem
Contributor

Perfect, using the current=f to get the last-but-one value is clever!

I update the answer.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Consider this to get around the list/mvindex thingy:

... | streamstats current=f last(duration) as last_duration by s | eval diff = duration - last_duration

AshimaE
Explorer

Could you explain what current=f is used for here. Also last(x) takes us to the oldest entry for that s isn't it. so how is it being used here exactly. sorry for the naive doubts. im still a newbie.

0 Karma
Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer at Splunk .conf24 ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...