Splunk Search

How to calculate time difference b/w multiple events and sum for a field

ramnaresh2051
Engager

I have requirement to calculate total time a user has been connected to system, for that I have logs as below which stats that when user connected and disconnected, how can we calculate his total time connected by userId.

 

{
"timestamp": "2024-02-19T20:15:01,544Z",
"message": "Connection created.",
"sessionId": "383c4345",
"operation": "Connect",
"spanId": "578045e3cfdbb598",
"traceId": "578045e3cfdbb598",
"userId": "test12"
}
{
"timestamp": "2024-02-19T21:19:01,544Z",
"message": "Connection disconnected.",
"sessionId": "383c4345",
"operation": "Disconnect",
"spanId": "578045e3cfdbb598",
"traceId": "578045e3cfdbb598",
"userId": "test12"
}
{
"timestamp": "2024-02-19T22:24:01,544Z",
"message": "Connection created.",
"sessionId": "383cxfe45",
"operation": "Connect",
"spanId": "fdcaf3d14e630be3",
"traceId": "fdcaf3d14e630be3",
"userId": "test12"
}
{
"timestamp": "2024-02-19T23:17:01,544Z",
"message": "Connection disconnected.",
"sessionId": "383cxfe45",
"operation": "Disconnect",
"spanId": "fdcaf3d14e630be3",
"traceId": "fdcaf3d14e630be3",
"userId": "test12"
}
{
"timestamp": "2024-02-19T19:45:01,544Z",
"message": "Connection created.",
"sessionId": "7490647a",
"operation": "Connect",
"spanId": "c6cc6afa1a770fc9",
"traceId": "c6cc6afa1a770fc9",
"userId": "test_234"
}
{
"timestamp": "2024-02-19T20:19:01,544Z",
"message": "Connection disconnected.",
"sessionId": "7490647a",
"operation": "Disconnect",
"spanId": "c6cc6afa1a770fc9",
"traceId": "c6cc6afa1a770fc9",
"userId": "test_234"
}
{
"timestamp": "2024-02-19T21:34:01,544Z",
"message": "Connection created.",
"sessionId": "f9cbd2f",
"operation": "Connect",
"spanId": "fdcaf3d14e630be3",
"traceId": "fdcaf3d14e630be3",
"userId": "test_234"
}
{
"timestamp": "2024-02-19T23:20:01,544Z",
"message": "Connection disconnected.",
"sessionId": "f9cbd2f",
"operation": "Disconnect",
"spanId": "fdcaf3d14e630be3",
"traceId": "fdcaf3d14e630be3",
"userId": "test_234"
}

 

 

Have given example for 2 userId test12 and test_234, has connected /disconnected 2 times.  And I am expecting results as below.

UserTotal Time Connected 
test121 hour 59 min
test_2342 hours 20 min

 

 

 

Labels (4)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

Something as simple as this should work

search ...
| stats range(_time) as duration by userId, traceId
| stats sum(duration) as duration by userId
| eval connected=tostring(duration, "duration")

based on your data example, where traceId indicates the connection boundaries, so stats by the traceId and userId will get the duration for each session and then stats to sum the durations.

If there is no disconnection event, then the range(_time) will be 0, so any active connections will not be counted. If you want to calculate that based on the time of the search you could do this

| stats count min(_time) as start_time range(_time) as duration by userId, traceId
| addinfo
| eval duration=if(duration=0 AND count=1, info_max_time - start_time, duration)
| stats sum(duration) as duration by userId
| eval connected=tostring(duration, "duration")

which would treat the ending time range of the search to be the ending range of an open connection.

and finally to handle all cases where you may have no start or not end event, try this one

search...
| eval type=if(match(message,"created"),1,2)
| stats count values(type) as types min(_time) as first_event_time range(_time) as duration by userId, traceId
| addinfo
``` Handle created but no disconnect ```
| eval duration=if(duration=0 AND count=1 AND mvcount(types)=1 AND types=1, info_max_time - first_event_time, duration)
``` Handle disconnect but no created ```
| eval duration=if(duration=0 AND count=1 AND mvcount(types)=1 AND types=2, first_event_time - info_min_time, duration)
| stats sum(duration) as duration by userId
| eval connected=tostring(duration, "duration")

View solution in original post

bowesmana
SplunkTrust
SplunkTrust

Something as simple as this should work

search ...
| stats range(_time) as duration by userId, traceId
| stats sum(duration) as duration by userId
| eval connected=tostring(duration, "duration")

based on your data example, where traceId indicates the connection boundaries, so stats by the traceId and userId will get the duration for each session and then stats to sum the durations.

If there is no disconnection event, then the range(_time) will be 0, so any active connections will not be counted. If you want to calculate that based on the time of the search you could do this

| stats count min(_time) as start_time range(_time) as duration by userId, traceId
| addinfo
| eval duration=if(duration=0 AND count=1, info_max_time - start_time, duration)
| stats sum(duration) as duration by userId
| eval connected=tostring(duration, "duration")

which would treat the ending time range of the search to be the ending range of an open connection.

and finally to handle all cases where you may have no start or not end event, try this one

search...
| eval type=if(match(message,"created"),1,2)
| stats count values(type) as types min(_time) as first_event_time range(_time) as duration by userId, traceId
| addinfo
``` Handle created but no disconnect ```
| eval duration=if(duration=0 AND count=1 AND mvcount(types)=1 AND types=1, info_max_time - first_event_time, duration)
``` Handle disconnect but no created ```
| eval duration=if(duration=0 AND count=1 AND mvcount(types)=1 AND types=2, first_event_time - info_min_time, duration)
| stats sum(duration) as duration by userId
| eval connected=tostring(duration, "duration")

ramnaresh2051
Engager

Thanks a lot @bowesmana, this helped me.

There are some message sent events also there in b/w these connect and disconnect (which I need to calculate how many messages sent for a userId), is there any way to exclude these message sent while calculating  min(_time), in case where no connect was available in logs.

0 Karma

bowesmana
SplunkTrust
SplunkTrust

You can generally do this type of logic directly in the stats through eval statements and in eval categorisation in the logic prior to the stats for clarity, so to expand on the 'type' logic I used in the earlier example, something like this

| eval type=case(match(message,"created"),1,
                 match(message,"disconnected"),2,
                 match(message,"other_message"),3)
| stats count(eval(if(type<3,_time,null()))) as connection_count 
        count(eval(if(type=3,_time,null()))) as message_count
        values(type) as types 
        min(eval(if(type<3,_time,null()))) as first_event_time
        range(eval(if(type<3,_time,null()))) as duration 
        by userId, traceId
| addinfo
``` Handle created but no disconnect ```
| eval duration=if(duration=0 AND connection_count=1 AND types=1, info_max_time - first_event_time, duration)
``` Handle disconnect but no created ```
| eval duration=if(duration=0 AND connection_count=1 AND types=2, first_event_time - info_min_time, duration)
| stats values(message_count) as message_count sum(duration) as duration by userId

so the type is 1-3 depending on text you want to match and then the count eval statements in the stats count the event types and the time calculations exclude the type=3

This is untested, but hopefully you get the picture. There are probably some optimisations there, but it should do what you need

Get Updates on the Splunk Community!

Index This | I’m short for "configuration file.” What am I?

May 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with a Special ...

New Articles from Academic Learning Partners, Help Expand Lantern’s Use Case Library, ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Your Guide to SPL2 at .conf24!

So, you’re headed to .conf24? You’re in for a good time. Las Vegas weather is just *chef’s kiss* beautiful in ...