Hi
I have a vast data set with a sample as below. Need to group the data based on three columns latest timestamp data and get the fourth column value against the latest timestamp found for that grouped data.
Deployed_Data_time | env | app | version |
4/16/2024 15:29 | axe1 | app1 | v-228 |
4/16/2024 15:29 | axe1 | app1 | v-228 |
9/15/2023 8:12 | axe1 | app1 | v-131 |
9/15/2023 8:05 | axe2 | app1 | v-120 |
9/12/2023 1:19 | axe2 | app1 | v-128 |
4/16/2024 15:29 | axe2 | app2 | v-628 |
4/16/2024 15:26 | axe2 | app2 | v-626 |
9/15/2023 8:12 | axe2 | app2 | v-531 |
9/15/2023 8:05 | axe1 | app2 | v-530 |
9/12/2023 1:19 | axe1 | app2 | v-528 |
and I need the output as
app | axe1 | axe2 |
app1 | v-228 | v-120 |
app2 | v-530 | v-628 |
And I tried something as below but output is not as expected.
index=*.log source=*Report* | eval latestDeployed_version=Deployed_Data_time."|".version | eval latestVersion=Deployed_Data_time."|".version | stats latest(Deployed_Data_time) AS Deployed_Data_time values(env) AS env
max(latestVersion) AS latestVersion BY app | rex field=latestVersion "[\|]+(?<version>.*)" | table app,version,env | chart values(version) by app, env limit=0
| fillnull value="Not Deployed"
Please help me achieve this .
Thanks
@NathanAsh You're right! Then use the strptime() example I mentioned and the latest() function. You don't seem to need _time so just convert Deployed_data_time to _time and you can use latest(version)
| makeresults format=csv data="Deployed_Data_time,env,app,version
4/16/2024 15:29,axe1,app1,v-228
4/16/2024 15:29,axe1,app1,v-228
9/15/2023 8:12,axe1,app1,v-131
9/15/2023 8:05,axe2,app1,v-120
9/12/2023 1:19,axe2,app1, v-128
4/16/2024 15:29,axe2,app2,v-628
4/16/2024 15:26,axe2,app2,v-626
9/15/2023 8:12,axe2,app2,v-531
9/15/2023 8:05,axe1,app2,v-530
9/12/2023 1:19,axe1,app2, v-528"
| eval _time=strptime(Deployed_Data_time, "%m/%d/%Y %H:%M")
| stats latest(version) AS version BY app env
| table app,version,env
| chart values(version) by app, env limit=0
| fillnull value="Not Deployed"
Hi @NathanAsh ,
did you tried the OVER clause in the chart command?
index=*.log source=*Report*
| eval latestDeployed_version=Deployed_Data_time."|".version
| eval latestVersion=Deployed_Data_time."|".version
| stats
latest(Deployed_Data_time) AS Deployed_Data_time
values(env) AS env
max(latestVersion) AS latestVersion
BY app
| rex field=latestVersion "[\|]+(?<version>.*)"
| table app version env
| chart values(version) OVER app BY env limit=0
| fillnull value="Not Deployed"
for more infos see at https://docs.splunk.com/Documentation/Splunk/9.2.1/SearchReference/Chart
Ciao.
Giuseppe
No results displayed, but table returns the same value as my try
Sorry, displays data but output is same as my try earlier, latest version is displayed along all env, not specific to that env is displayed.
anything in that line of thoughts be helpful to achieve this https://community.splunk.com/t5/Splunk-Search/How-to-convert-rows-to-columns/m-p/398009
This won't work as you want
| stats latest(Deployed_Data_time) AS Deployed_Data_time values(env) AS env max(latestVersion) AS latestVersion BY app
latest() function is based on the _time field, so if you want Deployed_Data_time to be _time then you need to evaluate it
| eval _time=strptime(Deployed_Data_time,"%m/%d/%Y %H:%M")
but you also cannot do max(latestVersion) as that is simply doing a numeric comparison on the date, which is a string, so 4/16/2024 is LESS than 9/15/2023 - 4 is less than 9.
If you ever want to do string based date comparisons, you need them to be ISO8601, i.e. YYYY-MM-DD-HH:MM:SS
So, using your example data, is this what you want?
| makeresults format=csv data="Deployed_Data_time,env,app,version
4/16/2024 15:29,axe1,app1,v-228
4/16/2024 15:29,axe1,app1,v-228
9/15/2023 8:12,axe1,app1,v-131
9/15/2023 8:05,axe2,app1,v-120
9/12/2023 1:19,axe2,app1, v-128
4/16/2024 15:29,axe2,app2,v-628
4/16/2024 15:26,axe2,app2,v-626
9/15/2023 8:12,axe2,app2,v-531
9/15/2023 8:05,axe1,app2,v-530
9/12/2023 1:19,axe1,app2, v-528"
| rex field=version "v-(?<v>\d+)"
| stats max(v) AS version BY app env
| table app,version,env
| chart values(version) by app, env limit=0
| fillnull value="Not Deployed"
No , app1 in axe2 value should be 120 not 128, (latest deployed version as per the date timestamp)
@NathanAsh You're right! Then use the strptime() example I mentioned and the latest() function. You don't seem to need _time so just convert Deployed_data_time to _time and you can use latest(version)
| makeresults format=csv data="Deployed_Data_time,env,app,version
4/16/2024 15:29,axe1,app1,v-228
4/16/2024 15:29,axe1,app1,v-228
9/15/2023 8:12,axe1,app1,v-131
9/15/2023 8:05,axe2,app1,v-120
9/12/2023 1:19,axe2,app1, v-128
4/16/2024 15:29,axe2,app2,v-628
4/16/2024 15:26,axe2,app2,v-626
9/15/2023 8:12,axe2,app2,v-531
9/15/2023 8:05,axe1,app2,v-530
9/12/2023 1:19,axe1,app2, v-528"
| eval _time=strptime(Deployed_Data_time, "%m/%d/%Y %H:%M")
| stats latest(version) AS version BY app env
| table app,version,env
| chart values(version) by app, env limit=0
| fillnull value="Not Deployed"
Hi
Thanks, it works for the sample data which I have given, but the actual data I pushed in splunk is not as per the Deployed date timestamp, I have pushed old data (2023 year) lately and new data (2024 ) first. Hence for some columns the results are coming as per the data pushed in to splunk time. Any work around can be applied?
Mmm, that's odd because I use that technique to manipulate _time - If you could find a simple example of _raw data where that is the case - perhaps by limiting the search just to pick up an event of each type - I'd be really interested to see.
If the date format for the 2023 data is not as per the strptime format syntax that would cause a problem as it would be later - that would be my suspicion.
If you can do a simple search for that 2023 data and do this
| eval orig_time=strftime(_time, "%F %T.%Q")
| eval _time=strptime(...)
| table _time orig_time
that may show the difference
Got it remediated by including gcusello suggestion of | eval latestDeployed_version=Deployed_Data_time."|".version and used that field in your stats statement as max value instead of latest. This worked well and validated to be fine. Thanks a lot to both