Splunk Search

Nested query

vihshah
Engager

Hi,

My task is to extract a field from the results of one query and then search for that field's value. The second query returns an object as a string, and I want to extract data from it.

In summary, I need 3 things:
1. A plain query to get the data and extract a particular field.
2. Use that field as the input for a second query.
3. Get the object data back as a string, extract fields from it, and generate a report in tabular format.

I was able to complete the 1st step and extract the field, but I am unable to search for it.
Below is the query I tried.

sourcetype="mykube.source" "failed request"  | rex "failed request:(?<request_id>[\w-]+)" | table request_id | head 1 | eval req_query = request_id | search req_query

If I run it up to `head 1`, I get the first request_id, but after that the result is empty.
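
For readers hitting the same wall: `| search req_query` most likely looks for the literal term `req_query` rather than the value held in that field, and after `table request_id` there is no raw event text left to match against. One way to feed the extracted value into a second search is a subsearch. The following is a minimal sketch, not a confirmed fix, and it assumes the request id also appears as plain text in the events you want to find. `return $request_id` hands back only the value, which the outer search then uses as a search term.

```spl
sourcetype="mykube.source"
    [ search sourcetype="mykube.source" "failed request"
      | rex "failed request:(?<request_id>[\w-]+)"
      | search request_id=*
      | head 1
      | return $request_id ]
```

The `search request_id=*` step is only there so that `head 1` picks an event where the extraction actually matched.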

vihshah
Engager

Hi @ITWhisperer ,

Below is my query which returns 250+ events

sourcetype="my_source" "failed request, request id=" | rex "failed request, request id=(?<request_id>\"?[\w-]+\"?)" | stats values(request_id) as request_ids | eval request_ids = "\"" . mvjoin(request_ids, "\" OR \"") . "\"" | eval request_ids = replace(request_ids,"^request_id=","") | format

ITWhisperer
SplunkTrust

These searches don't look right - please confirm that they accurately represent what you are actually doing

vihshah
Engager

Below is the query I tried

sourcetype="my_source" [search sourcetype="my_source" "failed request, request id=" | rex "failed request, request id=(?<request_id>[\w-]+)" | request_id | fields request_id | rename request_id as search] | table user_id user_name

Here I get only one user id, repeated multiple times, but when I run a plain query like the one below

sourcetype="my_source" "failed request, request id=" | rex "failed request, request id=(?<request_id>[\w-]+)" | table request_id

Here I see more than 250 events

ITWhisperer
SplunkTrust

Again, you aren't really giving me any useful information.

What is your complete search which is not doing as you expect?

How many of each request id are you getting?

How many of each request id were you expecting?

vihshah
Engager

I printed the request ids.

I see only the first one, printed multiple times, whereas the original query returns more than 250 request ids.

ITWhisperer
SplunkTrust

How do you know you are getting fewer results (than expected)?

Which events are being missed?

Is there a common theme to the missing events?

Does it happen all the time or only with certain timeframes?

What else have you done to investigate the issue?

vihshah
Engager

😄 I understand. What data can I give you for a better understanding?

ITWhisperer
SplunkTrust

How do you imagine I am going to be able to determine that?

vihshah
Engager

hey @ITWhisperer ,

I see there is one issue: I get fewer events than from my actual query. Why might that happen?

ITWhisperer
SplunkTrust

The field name "search" is given special treatment when returned in a subsearch in that the field name is not returned, so instead of the subsearch being ((request_id="valueA") OR (request_id="valueB")), it becomes (("valueA") OR ("valueB")). The same goes for field name "query".
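
The difference can be seen without touching real data. The sketch below uses generated events (`makeresults`) with made-up values; the field name `request_id` and the values `value1` and `value2` are illustrative only. `format` is the step that turns subsearch rows into a search expression.

```spl
| makeresults count=2
| streamstats count as n
| eval request_id = "value" . n
| table request_id
| format
```

This should produce a `search` field containing `( ( request_id="value1" ) OR ( request_id="value2" ) )`. Adding `| rename request_id as search` before `| format` is expected to drop the `request_id=` prefix, in line with the special treatment described above.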

vihshah
Engager

This seems to be working. Can you please explain how it works? @ITWhisperer

ITWhisperer
SplunkTrust

It looks like you have introduced the double equals again!

Try something like this

sourcetype="my_source" [search sourcetype="my_source" "failed request, request id=" | rex "failed request, request id=(?<request_id>[\w-]+)" | top limit=100 request_id | fields request_id | rename request_id as search]

PickleRick
SplunkTrust

OK. I haven't been following this topic very closely but I have a feeling you extracted the field manually.

If so, that makes the search harder because at the time of the initial search you don't have fields extracted yet.

But if it's simply a case of searching for another field than the one you're getting your values from, just do " | rename field1 as field2" at the end of your subsearch and you're all set.

vihshah
Engager

Hi @ITWhisperer ,

below is the search I am trying

search sourcetype="my_source" "failed request, request id=" | rex "failed request, request id==(?<request_id>\"?[\w-]+\"?)" | stats values(request_id) as request_ids | eval request_ids = "\"" . mvjoin(request_ids, "\" OR \"") . "\"" | eval request_ids = replace(request_ids,"^request_id=","") | format

 @PickleRick ,

Sorry, I did not follow you. Basically, my subsearch gives me a list of failed request_ids. That list then acts as the input to my main search, which returns the main events, and from those I need to extract other fields related to each request id (i.e. accountId).

PickleRick
SplunkTrust

Ok. There are three ways of resolving this.

1. Preferred - define extractions for the needed fields. It's most probably not the only time you're gonna be using them.

2. Add the subsearch further down the search pipeline. This is a bad idea because you'd first be extracting the field from all events and filtering the events only after that. Waste of resources.

3. Rework your subsearch so that you manually create a set of conditions to be inserted "as is" into the main search and return that as a single value of a field called "search".

Both latter solutions are overly complicated and/or inefficient so I'd advise you to properly extract the fields in the first place.
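
For illustration only, option 2 might look roughly like the sketch below. It assumes the main events carry the id as `request id=<id>`, which the thread does not confirm; that `rex` pattern is hypothetical, and `user_id` and `user_name` are taken from the earlier attempt. Every event of the sourcetype is read and parsed before the filter applies, which is the inefficiency mentioned above.

```spl
sourcetype="my_source"
| rex "request id=(?<request_id>[\w-]+)"
| search
    [ search sourcetype="my_source" "failed request, request id="
      | rex "failed request, request id=(?<request_id>[\w-]+)"
      | dedup request_id
      | fields request_id ]
| table request_id user_id user_name
```

With option 1 in place (a configured extraction for `request_id`), the outer `rex` would not be needed and the subsearch could sit directly in the base search.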

ITWhisperer
SplunkTrust

Since you have been playing around with the search, which search with the 15 minute timeframe are you currently using?

vihshah
Engager

Hi @ITWhisperer ,

I reduced the timeframe to 15 minutes, so now I have only a few thousand events, but the query is still not giving any output.

ITWhisperer
SplunkTrust

As I said, subsearches are limited to 50k events - you have 85k events, so the subsearch is not performing as you are expecting. You either need to limit the events your subsearch uses, e.g. change the timeframe, or rework your whole search so that it doesn't need a subsearch.
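
One possible shape for a search that avoids the subsearch altogether is sketched below. It is only an outline under assumptions: the `rex` pattern for the main events is hypothetical, and `is_failure` and `has_failure` are helper field names invented for this example. The idea is to flag the failure events, spread that flag to every event sharing the same `request_id`, and keep only the flagged groups.

```spl
sourcetype="my_source"
| rex "request id=(?<request_id>[\w-]+)"
| eval is_failure = if(like(_raw, "%failed request, request id=%"), 1, 0)
| eventstats max(is_failure) as has_failure by request_id
| where has_failure = 1
| table request_id user_id user_name
```

Events where the extraction finds no `request_id` receive no `has_failure` value and are dropped by the `where` clause.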

vihshah
Engager

Hi @ITWhisperer ,

I am not sure if this helps, but

I see 2 fields in statistics result,

1. request_ids -- this is empty
2. search result- this is where I see all this request_ids and results

vihshah
Engager

Hi @ITWhisperer ,

playing around further

search sourcetype="my_source" "failed request, request id=" | rex "failed request, request id==(?<request_id>\"?[\w-]+\"?)" | stats values(request_id) as request_ids | eval request_ids = "\"" . mvjoin(request_ids, "\" OR \"") . "\"" | eval request_ids = replace(request_ids,"^request_id=","") | format

This gives me output like the following:

( ( request_ids="\"0fb1-4a2-a3-b8b\" OR \"0b99-d2-4e\" OR \"0c2-01a0-454-a3-2f3\"" ) )


But `request_ids` is still present in the output, so my main query does not work as expected.
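
The leftover `request_ids=` comes from `format`, which prefixes each value with its field name. A sketch of one way around it, relying on the special handling of a field called `search` described elsewhere in this thread: build the OR-joined string directly in a field named `search`, return only that field, and drop the explicit `format`. The simplified `rex` (single `=`, no optional quotes) and the final `table` are assumptions carried over from the other attempts in the thread.

```spl
sourcetype="my_source"
    [ search sourcetype="my_source" "failed request, request id="
      | rex "failed request, request id=(?<request_id>[\w-]+)"
      | stats values(request_id) as request_id
      | eval search = "\"" . mvjoin(request_id, "\" OR \"") . "\""
      | fields search ]
| table user_id user_name
```

Running the subsearch on its own (without the outer search) is a quick way to check that the `search` field contains only the quoted ids joined by `OR`.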

vihshah
Engager

Hi @ITWhisperer ,
Sorry, I didn't follow you. I see a total of 267 events matched out of 85k events. I am not sure if this answers your question.
