Splunk Search

How do I optimize a query that hits disk usage limits when computing percentages with stats?

Veeru
Path Finder
 index=A host="bd*" OR host="p*" source="/apps/logs/*"
| bin _time span="30m"
| stats values(point) as point values(promotion) as promotionAction BY global  _time
| stats count(eval(promotion="OFFERED")) AS Offers count(eval(promotion="ACCEPTED")) AS Redeemed by _time point
| eval Take_Rate_Percent=((Redeemed)/(Offers)*100)
| eval Take_Rate_Percent=round(Take_Rate_Percent,2)



This search runs for about 15 minutes, but when it has to run for longer than 15 minutes it gets suspended because of the huge amount of data. Please help me optimize the query.

Thank you in advance
veeru


bowesmana
SplunkTrust

What is your time range?

How many values(promotion) and values(point) do you expect to have?

What is the cardinality of global?

Have you looked at the job inspector to see where the time is being spent?
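
For example, a quick check like this (just a sketch using the field names from your search, run over a short time range first, here the last hour) will show those cardinalities:

index=A host="bd*" OR host="p*" source="/apps/logs/*" earliest=-1h
| stats dc(global) AS distinct_global dc(point) AS distinct_point dc(promotion) AS distinct_promotion count AS events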

 


Veeru
Path Finder

My time range is more than 15 days, but the issue is that for the last 24 hours alone I have more than 4 lakh (400,000) events.
I want to optimize the search so the dashboard panel runs fast.


Veeru
Path Finder

@bowesmana 

When I run it for the last 7 days:
This search has completed and has returned 337 results by scanning 49,396,521 events in 539.528 seconds.
I want to optimize it to take fewer seconds.
The stats commands are taking the most time. Can you please help me find an alternative for them?


bowesmana
SplunkTrust

So when you run the first part of the search

 index=A host="bd*" OR host="p*" source="/apps/logs/*"
| bin _time span="30m"
| stats values(point) as point values(promotion) as promotionAction BY global  _time

how many values(point) and values(promotion) do you get per global/_time combination, and what is the number of rows, if you run this for 24 hours?
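
Something along these lines (a rough, untested sketch based on your search) would show both the number of rows and how many distinct values end up in each global/_time group:

index=A host="bd*" OR host="p*" source="/apps/logs/*" earliest=-24h
| bin _time span="30m"
| stats dc(point) AS distinct_points dc(promotion) AS distinct_promotions count AS events BY global _time
| stats count AS rows max(distinct_points) AS max_points avg(distinct_points) AS avg_points max(events) AS max_events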

 


Veeru
Path Finder

@bowesmana 
For 

| stats values(point) as point values(promotion) as promotionAction BY global  _time

I'm getting 6,446,807 results by scanning 6,773,378 events in 78.521 seconds
for 4 rows.

globalOpId   _time                 pointBankCode   promotionAction
0000016      2022-08-22 19:00:00
000003b      2022-08-22 14:00:00
00000bb4     2022-08-22 07:00:00                   ACCEPTED
00000c41     2022-08-22 05:30:00
00001136     2022-08-22 21:00:00
000015e7     2022-08-22 14:30:00

 


bowesmana
SplunkTrust

So, for a single globalOpId, in your example 0000016, does that mean the list of values is very large for that single row? The table you show has 6 rows, but you state 4 rows - can you clarify whether a row in your table is the same thing as the rows you are talking about?

Can you state how many values(point) you have for a SINGLE global code and _time - what sort of count of values do you have?

If you have several million point values for each row, then that is why it is so slow.

Can you clarify what you are trying to do? If your point cardinality is very high, you should not collect values(point) and then split them out again.

Without knowing your data, can you do the first stats by global point _time rather than just global _time, and then see if you can work out your calculations from that data? A rough sketch of what I mean is below.
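
As an untested sketch (field names taken from your search, and assuming you want to count each global at most once per 30-minute bucket), doing the counting inside stats instead of collecting values() first would look something like:

index=A host="bd*" OR host="p*" source="/apps/logs/*"
| bin _time span="30m"
| stats count(eval(promotion="OFFERED")) AS offered count(eval(promotion="ACCEPTED")) AS accepted BY global point _time
| stats count(eval(offered>0)) AS Offers count(eval(accepted>0)) AS Redeemed BY _time point
| eval Take_Rate_Percent=round(Redeemed/Offers*100,2)

If the percentages should be per event rather than per global, the first stats can be dropped and the counts done directly BY _time point.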

 

 
