Splunk Search

Looping through the avg_distance series with a lookback of 20 periods to find the Mean Absolute Deviation

w344423
Explorer

Hi all Ninja's i need some help here to find this calculation which can be done easily in excel but i wanted to convert to SPL. I have tried to use streamstats but and autoregress but unable to figure out how to do it.

Problem statement:

How do I convert the calculation for mean deviation. 

below formula is applied in col : j

explanation of the formula, calculation starts at count of 20, it takes the value of col : i and subtract col : h once it done with its own row take the same i value and subtract from "count - 1" after it finishes 20 period.

hopefully someone could really help as this part of calculation has me stuck for a few days which i unable to submit this to my boss for system migration to splunk. 

=(ABS(I22-H22)+ABS(I22-H21)+ABS(I22-H20)+ABS(I22-H19)+ABS(I22-H18)+ABS(I22-H17)+ABS(I22-H16)+ABS(I22-H15)+ABS(I22-H14)+ABS(I22-H13)+ABS(I22-H12)+ABS(I22-H11)+ABS(I22-H10)+ABS(I22-H9)+ABS(I22-H8)+ABS(I22-H7)+ABS(I22-H6)+ABS(I22-H5)+ABS(I22-H4)+ABS(I22-H3))/20

 

     col : hcol : icol : j
countdatestoptopbottomavg_distancemoving_avg_20_periodMean Deviation
12022-08-012.292.312.2652.2883  
22022-08-022.312.372.282.32  
32022-08-032.272.362.242.29  
42022-08-042.352.362.262.3233  
52022-08-052.442.572.342.45  
62022-08-082.492.512.412.47  
72022-08-092.412.522.352.4267  
82022-08-102.512.592.382.4933  
92022-08-112.492.51762.432.4792  
102022-08-122.382.52.262.38  
112022-08-152.352.412.322.36  
122022-08-162.392.412.312.37  
132022-08-172.312.382.32.33  
142022-08-182.282.322.232.2767  
152022-08-192.112.322.0552.1617  
162022-08-222.082.192.072.1133  
172022-08-232.022.1051.922.015  
182022-08-242.012.061.942.0033  
192022-08-252.012.071.871.9833  
202022-08-262.022.061.932.00332.27690.13814
212022-08-292.022.041.962.00672.26280.15529
222022-08-302.2052.2222.14172.25390.160265
232022-08-312.092.252.072.13672.24620.16509
242022-09-011.922.161.9222.230.173545
252022-09-021.861.991.831.89332.20220.1766
262022-09-061.81.861.731.79672.16850.176745
272022-09-071.841.851.751.81332.13790.175175
282022-09-081.71.851.6651.73832.10010.174805
292022-09-091.751.761.631.71332.06180.17136
Labels (2)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

Here is your data in a Splunk example. It gives very slightly different values due to rounding

| makeresults
| eval _raw="count	date	stop	top	bottom	avg_distance	moving_avg_20_period	Mean_Deviation
1	2022-08-01	2.29	2.31	2.265	2.2883	 	 
2	2022-08-02	2.31	2.37	2.28	2.32	 	 
3	2022-08-03	2.27	2.36	2.24	2.29	 	 
4	2022-08-04	2.35	2.36	2.26	2.3233	 	 
5	2022-08-05	2.44	2.57	2.34	2.45	 	 
6	2022-08-08	2.49	2.51	2.41	2.47	 	 
7	2022-08-09	2.41	2.52	2.35	2.4267	 	 
8	2022-08-10	2.51	2.59	2.38	2.4933	 	 
9	2022-08-11	2.49	2.5176	2.43	2.4792	 	 
10	2022-08-12	2.38	2.5	2.26	2.38	 	 
11	2022-08-15	2.35	2.41	2.32	2.36	 	 
12	2022-08-16	2.39	2.41	2.31	2.37	 	 
13	2022-08-17	2.31	2.38	2.3	2.33	 	 
14	2022-08-18	2.28	2.32	2.23	2.2767	 	 
15	2022-08-19	2.11	2.32	2.055	2.1617	 	 
16	2022-08-22	2.08	2.19	2.07	2.1133	 	 
17	2022-08-23	2.02	2.105	1.92	2.015	 	 
18	2022-08-24	2.01	2.06	1.94	2.0033	 	 
19	2022-08-25	2.01	2.07	1.87	1.9833	 	 
20	2022-08-26	2.02	2.06	1.93	2.0033	2.2769	0.13814
21	2022-08-29	2.02	2.04	1.96	2.0067	2.2628	0.15529
22	2022-08-30	2.205	2.22	2	2.1417	2.2539	0.160265
23	2022-08-31	2.09	2.25	2.07	2.1367	2.2462	0.16509
24	2022-09-01	1.92	2.16	1.92	2	2.23	0.173545
25	2022-09-02	1.86	1.99	1.83	1.8933	2.2022	0.1766
26	2022-09-06	1.8	1.86	1.73	1.7967	2.1685	0.176745
27	2022-09-07	1.84	1.85	1.75	1.8133	2.1379	0.175175
28	2022-09-08	1.7	1.85	1.665	1.7383	2.1001	0.174805
29	2022-09-09	1.75	1.76	1.63	1.7133	2.0618	0.17136
"
| multikv forceheader=1
| table count	date	stop	top	bottom	avg_distance	moving_avg_20_period	"Mean_Deviation"
``` FROM HERE ```
| streamstats window=20 sum(avg_distance) as coli_moving_average list(avg_distance) as avg_distance_values
| eval coli_moving_average=if(count>=20,coli_moving_average/20, null())
| eval deviation=if(count>=20, mvmap(avg_distance_values, round(abs(coli_moving_average-avg_distance_values),4)), null())
| eval mean_deviation=avg(deviation)
| fields - deviation avg_distance_values

 but the part you want is from streamstats

Hope this helps.

The streamstats is key in that it calculates the moving average and also collects the list of 20 values into the 'current' row.

Then it uses mvmap to do the calcs of the absolute differences.

 

View solution in original post

bowesmana
SplunkTrust
SplunkTrust

Here is your data in a Splunk example. It gives very slightly different values due to rounding

| makeresults
| eval _raw="count	date	stop	top	bottom	avg_distance	moving_avg_20_period	Mean_Deviation
1	2022-08-01	2.29	2.31	2.265	2.2883	 	 
2	2022-08-02	2.31	2.37	2.28	2.32	 	 
3	2022-08-03	2.27	2.36	2.24	2.29	 	 
4	2022-08-04	2.35	2.36	2.26	2.3233	 	 
5	2022-08-05	2.44	2.57	2.34	2.45	 	 
6	2022-08-08	2.49	2.51	2.41	2.47	 	 
7	2022-08-09	2.41	2.52	2.35	2.4267	 	 
8	2022-08-10	2.51	2.59	2.38	2.4933	 	 
9	2022-08-11	2.49	2.5176	2.43	2.4792	 	 
10	2022-08-12	2.38	2.5	2.26	2.38	 	 
11	2022-08-15	2.35	2.41	2.32	2.36	 	 
12	2022-08-16	2.39	2.41	2.31	2.37	 	 
13	2022-08-17	2.31	2.38	2.3	2.33	 	 
14	2022-08-18	2.28	2.32	2.23	2.2767	 	 
15	2022-08-19	2.11	2.32	2.055	2.1617	 	 
16	2022-08-22	2.08	2.19	2.07	2.1133	 	 
17	2022-08-23	2.02	2.105	1.92	2.015	 	 
18	2022-08-24	2.01	2.06	1.94	2.0033	 	 
19	2022-08-25	2.01	2.07	1.87	1.9833	 	 
20	2022-08-26	2.02	2.06	1.93	2.0033	2.2769	0.13814
21	2022-08-29	2.02	2.04	1.96	2.0067	2.2628	0.15529
22	2022-08-30	2.205	2.22	2	2.1417	2.2539	0.160265
23	2022-08-31	2.09	2.25	2.07	2.1367	2.2462	0.16509
24	2022-09-01	1.92	2.16	1.92	2	2.23	0.173545
25	2022-09-02	1.86	1.99	1.83	1.8933	2.2022	0.1766
26	2022-09-06	1.8	1.86	1.73	1.7967	2.1685	0.176745
27	2022-09-07	1.84	1.85	1.75	1.8133	2.1379	0.175175
28	2022-09-08	1.7	1.85	1.665	1.7383	2.1001	0.174805
29	2022-09-09	1.75	1.76	1.63	1.7133	2.0618	0.17136
"
| multikv forceheader=1
| table count	date	stop	top	bottom	avg_distance	moving_avg_20_period	"Mean_Deviation"
``` FROM HERE ```
| streamstats window=20 sum(avg_distance) as coli_moving_average list(avg_distance) as avg_distance_values
| eval coli_moving_average=if(count>=20,coli_moving_average/20, null())
| eval deviation=if(count>=20, mvmap(avg_distance_values, round(abs(coli_moving_average-avg_distance_values),4)), null())
| eval mean_deviation=avg(deviation)
| fields - deviation avg_distance_values

 but the part you want is from streamstats

Hope this helps.

The streamstats is key in that it calculates the moving average and also collects the list of 20 values into the 'current' row.

Then it uses mvmap to do the calcs of the absolute differences.

 

w344423
Explorer

OMG now i learn something new using mvmap command. thank you so much. 

Tags (1)
0 Karma
Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer Certification at ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...