Splunk Search

How to optimize the query that hits disk usage when computing with stats and percentage?

Veeru
Path Finder
 index=A host="bd*" OR host="p*" source="/apps/logs/*"
| bin _time span="30m"
| stats values(point) as point values(promotion) as promotionAction BY global  _time
| stats count(eval(promotionAction="OFFERED")) AS Offers count(eval(promotionAction="ACCEPTED")) AS Redeemed by _time point
| eval Take_Rate_Percent=((Redeemed)/(Offers)*100)
| eval Take_Rate_Percent=round(Take_Rate_Percent,2)



This search works when it covers a 15-minute window, but when I search over more than 15 minutes it gets suspended due to the huge amount of data. Please help me optimize the query.

Thank you in advance
veeru


bowesmana
SplunkTrust

What is your time range?

How many values(promotion) and values(point) do you expect to have?

What is the cardinality of global?

Have you looked at the job inspector to see where the time is being spent?
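
If it helps, a quick throwaway diagnostic along these lines (a sketch only, with the field names taken from the query above) would show those cardinalities over, say, the last 24 hours:

 index=A (host="bd*" OR host="p*") source="/apps/logs/*" earliest=-24h
 | stats dc(global) AS distinct_global dc(point) AS distinct_point dc(promotion) AS distinct_promotion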

 


Veeru
Path Finder

My time range is more than 15 days, but the issue is that even for just the last 24 hours I have more than 4 lakh (400,000) events.
I want to optimize the search so the dashboard panel runs fast.


Veeru
Path Finder

@bowesmana 

When I run it for the last 7 days:
This search has completed and has returned 337 results by scanning 49,396,521 events in 539.528 seconds.
I want to optimize it to take fewer seconds. The stats commands are taking most of the time - can you please help me find an alternative for this?


bowesmana
SplunkTrust

So when you run the first part of the search

 index=A host="bd*" OR host="p*" source="/apps/logs/*"
| bin _time span="30m"
| stats values(point) as point values(promotion) as promotionAction BY global  _time

how many values(point) and values(promotion) do you get per global/_time, and what is the number of rows, if you run this for 24 hours?
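
One way to measure that without producing millions of rows is to replace the values() collection with distinct counts and then summarise them; a sketch (the output field names here are just illustrative), run over 24 hours:

 index=A (host="bd*" OR host="p*") source="/apps/logs/*" earliest=-24h
 | bin _time span=30m
 | stats dc(point) AS point_values dc(promotion) AS promotion_values BY global _time
 | stats count AS rows max(point_values) AS max_point_values max(promotion_values) AS max_promotion_values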

 


Veeru
Path Finder

@bowesmana 
For 

| stats values(point) as point values(promotion) as promotionAction BY global  _time

I'm getting 6,446,807 results by scanning 6,773,378 events in 78.521 seconds, for 4 rows.

globalOpId   _time                 pointBankCode   promotionAction
0000016      2022-08-22 19:00:00
000003b      2022-08-22 14:00:00
00000bb4     2022-08-22 07:00:00                   ACCEPTED
00000c41     2022-08-22 05:30:00
00001136     2022-08-22 21:00:00
000015e7     2022-08-22 14:30:00

 


bowesmana
SplunkTrust

So, for a single globalOpId, in your example 0000016, does that mean the list of values is very large for that single row? The table you show has 6 rows, but you state 4 rows - can you clarify what you mean by a row here?

Can you state how many values(point) you have for a SINGLE global code and _time - what sort of count of values do you have?

If you have several million point values for each row, then that is why it is so slow.

Can you clarify what you are trying to do? If your point cardinality is very high, you should not collect values(point) and then split them out again.

Without knowing your data, can you do the first stats by global point _time rather than just global _time, and then see if you can work out your calculations from that data?
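
For illustration, here is one possible sketch of that restructuring, assuming each global should be counted at most once per point and 30-minute bucket, and using the field names from the original query (adjust if your real fields are globalOpId/pointBankCode):

 index=A (host="bd*" OR host="p*") source="/apps/logs/*"
 | bin _time span=30m
 | stats count(eval(promotion="OFFERED")) AS offered count(eval(promotion="ACCEPTED")) AS accepted BY global point _time
 | stats count(eval(offered>0)) AS Offers count(eval(accepted>0)) AS Redeemed BY point _time
 | eval Take_Rate_Percent=round(Redeemed/Offers*100,2)

The idea is that the intermediate result stays at one row per global/point/bucket with plain counts, instead of accumulating large values() lists in memory, which is usually where the time goes in searches like this.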

 

 
