Splunk Search

How to optimize the query that hits disk usage when computing with stats and percentage?

Veeru
Path Finder
 index=A host="bd*" OR host="p*" source="/apps/logs/*"
| bin _time span="30m"
| stats values(point) as point values(promotion) as promotionAction BY global  _time
| stats count(eval(promotionAction="OFFERED")) AS Offers count(eval(promotionAction="ACCEPTED")) AS Redeemed by _time point
| eval Take_Rate_Percent=((Redeemed)/(Offers)*100)
| eval Take_Rate_Percent=round(Take_Rate_Percent,2)



This search works when it covers a 15-minute window, but when I search over more than 15 minutes it gets suspended due to the huge amount of data. Please help me optimize the query.

Thank you in advance
veeru


bowesmana
SplunkTrust

What is your time range?

How many values(promotion) and values(point) do you expect to have?

What is the cardinality of global?

Have you looked at the job inspector to see where the time is being spent?
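
If it helps, a quick throwaway diagnostic along these lines (a sketch only, with the field names taken from the query above) would show those cardinalities over, say, the last 24 hours:

 index=A (host="bd*" OR host="p*") source="/apps/logs/*" earliest=-24h
 | stats dc(global) AS distinct_global dc(point) AS distinct_point dc(promotion) AS distinct_promotion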

 


Veeru
Path Finder

My time range is more than 15 days, but the issue is that even for just the last 24 hours I have more than 4 lakh (400,000) events.
I want to optimize the search so the dashboard panel runs fast.


Veeru
Path Finder

@bowesmana 

When I run it for the last 7 days:
This search has completed and has returned 337 results by scanning 49,396,521 events in 539.528 seconds.
I want to optimize it to take fewer seconds. The stats commands are taking most of the time - can you please help me find an alternative for this?


bowesmana
SplunkTrust

So when you run the first part of the search

 index=A host="bd*" OR host="p*" source="/apps/logs/*"
| bin _time span="30m"
| stats values(point) as point values(promotion) as promotionAction BY global  _time

how many values(point) and values(promotion) do you get per global/_time, and what is the number of rows, if you run this for 24 hours?
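
One way to measure that without producing millions of rows is to replace the values() collection with distinct counts and then summarise them; a sketch (the output field names here are just illustrative), run over 24 hours:

 index=A (host="bd*" OR host="p*") source="/apps/logs/*" earliest=-24h
 | bin _time span=30m
 | stats dc(point) AS point_values dc(promotion) AS promotion_values BY global _time
 | stats count AS rows max(point_values) AS max_point_values max(promotion_values) AS max_promotion_values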

 


Veeru
Path Finder

@bowesmana 
For 

| stats values(point) as point values(promotion) as promotionAction BY global  _time

I'm getting 6,446,807 results by scanning 6,773,378 events in 78.521 seconds, for 4 rows.

globalOpId   _time                 pointBankCode   promotionAction
0000016      2022-08-22 19:00:00
000003b      2022-08-22 14:00:00
00000bb4     2022-08-22 07:00:00                   ACCEPTED
00000c41     2022-08-22 05:30:00
00001136     2022-08-22 21:00:00
000015e7     2022-08-22 14:30:00

 


bowesmana
SplunkTrust

So, for a single globalOpId, in your example 0000016, does that mean the list of values is very large for that single row? The table you show has 6 rows, but you state 4 rows - can you clarify what you mean by a row here?

Can you state how many values(point) you have for a SINGLE global code and _time - what sort of count of values do you have?

If you have several million point values for each row, then that is why it is so slow.

Can you clarify what you are trying to do? If your point cardinality is very high, you should not collect values(point) and then split them out again.

Without knowing your data, can you do the first stats by global point _time rather than just global _time, and then see if you can work out your calculations from that data?
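
For illustration, here is one possible sketch of that restructuring, assuming each global should be counted at most once per point and 30-minute bucket, and using the field names from the original query (adjust if your real fields are globalOpId/pointBankCode):

 index=A (host="bd*" OR host="p*") source="/apps/logs/*"
 | bin _time span=30m
 | stats count(eval(promotion="OFFERED")) AS offered count(eval(promotion="ACCEPTED")) AS accepted BY global point _time
 | stats count(eval(offered>0)) AS Offers count(eval(accepted>0)) AS Redeemed BY point _time
 | eval Take_Rate_Percent=round(Redeemed/Offers*100,2)

The idea is that the intermediate result stays at one row per global/point/bucket with plain counts, instead of accumulating large values() lists in memory, which is usually where the time goes in searches like this.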

 

 
