Splunk Search

How to combine rows in groups of 2, grouped by nearest time?

JJ_Yam
Explorer

Title may be a bit confusing, so here's an example of what I'm trying to achieve:

I want to convert a table that looks like this:

_timeuseraction
2022-01-01 10:00:00user_1login
2022-01-01 10:00:10user_2login
2022-01-01 11:30:20user_1logout
2022-01-01 11:40:00user_1login
2022-01-01 12:00:00user_1logout
2022-01-01 12:01:00user_2logout

 

Into this:

userlogin_timelogout_time
user_12022-01-01 10:00:002022-01-01 11:30:20
user_22022-01-01 10:00:102022-01-01 12:01:00
user_12022-01-01 11:40:002022-01-01 12:00:00

 

Labels (3)
0 Karma
1 Solution

johnhuang
Motivator

This looks like a good use case for transactions.

 

<base_search>
| transaction user startswith=(action="login") endswith=(action="logout")
| eval login_time=strftime(_time, "%Y-%m-%d %H:%M:%S") 
| eval logout_time=strftime(_time+duration, "%Y-%m-%d %H:%M:%S")
| table user login_time logout_time

View solution in original post

johnhuang
Motivator

This looks like a good use case for transactions.

 

<base_search>
| transaction user startswith=(action="login") endswith=(action="logout")
| eval login_time=strftime(_time, "%Y-%m-%d %H:%M:%S") 
| eval logout_time=strftime(_time+duration, "%Y-%m-%d %H:%M:%S")
| table user login_time logout_time

bowesmana
SplunkTrust
SplunkTrust

I would postulate that there is no good reason for "transaction" 😀 - although in this case, the example data is small,.

The general issue with transaction for long "lived' transactions is that if you do have a login period of 1 hour 30 minutes, then Splunk has to retain intervening data in memory until it finds the logout - if there is no logout, then you need to define a maximum span for a transaction and then handle open transactions, otherwise you can unpredictable results with large datasets.

Just in case, this is a simple example of a bigger picture, here is a different technique that will do the calculation. The first part sets up your example data, but the work starts at streamstats...

| makeresults
| eval _raw="_time	user	action
2022-01-01 10:00:00	user_1	login
2022-01-01 10:00:10	user_2	login
2022-01-01 11:30:20	user_1	logout
2022-01-01 11:40:00	user_1	login
2022-01-01 12:00:00	user_1	logout
2022-01-01 12:01:00	user_2	logout"
| multikv forceheader=1
| eval _time=strptime(time, "%F %T")
| table _time user action
``` -------- This is the logic to convert the table from here --------```
| streamstats window=2 global=f list(action) as actions min(_time) as min max(_time) as max by user
| where mvcount(mvdedup(actions))=2 AND mvindex(actions, 0)="login"
| eval duration=max-min
| eval login_time=strftime(min, "%F %T")
| eval logout_time=strftime(max, "%F %T")
| sort min
| table user login_time logout_time

 This will handle missing logout (i.e. sequential logins)

 

johnhuang
Motivator

@bowesmana, i was wondering how long it'll take until the transaction police reads this post. 😀

0 Karma

bowesmana
SplunkTrust
SplunkTrust

🚓🚓🚓🚓🚓🚓🚓🚓🚓

JJ_Yam
Explorer

Hey, thanks for the answer! However, the line:

| eval logout_time=strftime(_time+duration, "%Y-%m-%d %H:%M:%S")

Doesn't work for my case as the logs do not include a duration field. Instead, there are 2 _time values, 1 for login time, and 1 for logout time. Do you know how I could turn these 2 values into separate columns - login_time and logout_time, as shown in the question?

0 Karma

johnhuang
Motivator

The duration is a field that is automatically calculated by the transaction. By default, the transaction command will take the start and end time to calculate the duration in seconds. The _time value represents the begining of the transaction which is the user logon event.

Are you actually having issues?

0 Karma

JJ_Yam
Explorer

Apologies, I just realized the duration field exists. Thank you!

0 Karma
Get Updates on the Splunk Community!

Combine Multiline Logs into a Single Event with SOCK - a Guide for Advanced Users

This article is the continuation of the “Combine multiline logs into a single event with SOCK - a step-by-step ...

Everything Community at .conf24!

You may have seen mention of the .conf Community Zone 'round these parts and found yourself wondering what ...

Index This | I’m short for "configuration file.” What am I?

May 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with a Special ...