Hi,
My task is to extract a field from the results of one query and then search for that field's value. The second query returns an object as a string, and I want to extract data from it.
In summary, I need 3 things:
1. Run a plain query to get the data and extract a particular field.
2. Use that field as the input for a second query.
3. Get the object data back as a string, extract fields from it, and generate a report in tabular format.
I got as far as the 1st step and extracted the field, but I am unable to search for it.
Below is the query I tried.
sourcetype="mykube.source" "failed request" | rex "failed request:(?<request_id>[\w-]+)" | table request_id | head 1 | eval req_query = request_id | search req_query
If I run it up to `head 1`, I get the first request_id, but after that the result is empty.
Sorry for my typo; it happened while editing the actual content. I ran it with exactly one `=` and still got no result.
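A note on why the last step comes back empty: `| search req_query` looks for the literal term `req_query`, not for the value held in the field of that name. The usual way to feed a value from one search into another is a subsearch. The following is only a sketch, assuming the second query should look in the same sourcetype for events that contain the extracted ID as raw text; renaming the field to `search` makes the subsearch hand back the bare value instead of a `request_id="..."` condition.

```spl
sourcetype="mykube.source"
    [ search sourcetype="mykube.source" "failed request"
      | rex "failed request:(?<request_id>[\w-]+)"
      | head 1
      | fields request_id
      | rename request_id as search ]
```

The outer base search here is an assumption; replace it with whatever index or sourcetype the second query is really meant to cover.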
You need to start breaking it down to find where the problem is - start with just
sourcetype="my_source" "failed request, request id=" | rex "failed request, request id=(?<request_id>[\w-]+)"
Do you get any results/values in request_id?
Hi @ITWhisperer @PickleRick
Yes. The sample event I gave you earlier was the output of that search. The whole command works as a main search, but if I put the same thing in a subsearch, it doesn't work.
Hi @ITWhisperer @PickleRick ,
Is there any alternative way to perform this task?
Hi @ITWhisperer , @PickleRick
I was experimenting further and ran the query below:
search sourcetype="my_source" "failed request, request id=" | rex "failed request, request id=(?<request_id>[\w-]+)" | stats values(request_id) as request_ids | format
this query gave me output as
( ( ( request_ids="10a-b-0m" OR request_ids="10a-b-0m" OR request_ids="10a-bn-10m" OR request_ids="10a-b-8m" OR request_ids="10a-b-6m" OR request_ids="10a-b-3c" OR request_ids="10a-b-3cw" OR request_ids="10a-bv" OR request_ids="10a-b-0m" OR request_ids="10a-b-09m" OR request_ids="10a-b-m9" OR request_ids="10a-bb-4c" OR request_ids="10a-b" OR request_ids="10a-e" OR request_ids="101v-n" OR request_ids="10a-c" OR request_ids="10a-b" ) ) )
But the same thing happened again: if I use this as a subquery, it does not work. I think my main query ends up searching for
request_ids="10a-b-0m" OR request_ids="10a-b-0m"
instead of
"10a-b-0m" OR "10a-b-0m"
What could be the solution?
This is a long thread but some of the answers you need are there, two key ones are these:
If your subsearch is using more than 50k events, your results will be compromised
Rename the request_id field as search
search sourcetype="my_source" "failed request, request id=" | rex "failed request, request id=(?<search>[\w-]+)" | stats count by search | fields search
Subsearches are limited to 50,000 events. How many events are returned by
sourcetype="my_source" "failed request, request id="
267 events only
I did a quick and dirty test
| makeresults count=10
| streamstats count
| eval size=count*10000
| map search="| makeresults count=100000 | streamstats count | search [ | makeresults count=$size$ | streamstats count | tail 1 | table count] "
And got 10 results from 10k to 100k. So apparently even though the initial makeresults creates 100k events, the important thing is that the subsearch only returns one result.
But if I rephrase the search a bit
| makeresults count=10
| streamstats count
| eval size=count*10000
| map search="| makeresults count=100000 | streamstats count | search [ | makeresults count=$size$ | streamstats count | table count] | stats count"
All my results are 10000, because all subsearches are finalized at 10k results.
@PickleRick You could be right. I thought I had a use case where the number of events found caused a problem, but I can't reproduce it. I can, however, reproduce the problem when the number of results returned exceeds (or rather is truncated at) 50k.
OK, firstly, we seem to be mixing limits. 50000 is the default limit for the subsearch used by the join command. The limit for an ordinary subsearch is 10000 results.
But as I understand the wording from the limits.conf spec, it applies to the number of results returned by the search, not the initial events processed by the first part of the pipeline.
I'll have to test it.
The limit has to do with the events, not the results, i.e. the number of events returned by the first part of the subsearch (before the first pipe). So, as you have already stated, you had more than 50k events to get your 250+ results. You need to reframe this initial part of the subsearch so that fewer than 50k events are found.
See https://docs.splunk.com/Documentation/Splunk/latest/Search/Changetheformatofsubsearchresults
The format command is implicitly added by Splunk at the end of the subsearch if it's not explicitly put there by you (you might want to explicitly override the format in which Splunk returns the data from the subsearch; but usually it's ok as it is).
So if you want to see what the data from your subsearch will be rendered as when returned from the subsearch, you can use the format command to see it as a resulting string value. 🙂
So if the string containing the resulting set of conditions is OK (and works properly when literally copy-pasted into your original search), you must be hitting some limit when running the search as a subsearch. It's apparently not the result count limit if you're getting only 250 or so rows, so it's probably the time limit for the subsearch.
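To see this without touching real data, here is a small sketch that builds three made-up request IDs with makeresults and passes them through format. The ID values are purely illustrative. With the rename line in place, the output should be a single `search` field holding the bare values joined with OR; if you remove the rename line, the same values should come back as `request_id="..."` conditions instead.

```spl
| makeresults count=3
| streamstats count
| eval request_id="10a-b-".count
| table request_id
| rename request_id as search
| format
```

Running it both ways makes it easy to compare the two strings and check which one the outer search actually needs.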
Hi @PickleRick,
The one you gave worked and I got all the request ids
Can you please explain how format worked out?
OK. So do
search sourcetype="my_source" "failed request, request id=" | rex "failed request, request id=(?<request_id>[\w-]+)" | fields request_id | rename request_id as search | format
And substitute manually your subsearch with the results from this one.
Hi @PickleRick ,
Sorry for that typo.
search sourcetype="my_source" "failed request, request id=" | rex "failed request, request id=(?<request_id>[\w-]+)" | fields request_id | rename request_id as search
The above is the correct one.
OK. this thread is soooo long 😉
The search you just posted has definitely at least one error - there is no command called "request_id". It's your field's name. So either you copy-pasted it wrong or it's not gonna work at all.
Hi @PickleRick,
Yes I have that.
If you scroll up in the conversation above, you will see I have pasted the result.
Do you want me to post the result of the query below?
search sourcetype="my_source" "failed request, request id=" | rex "failed request, request id=(?<request_id>[\w-]+)" | request_id | fields request_id | rename request_id as search
I'm asking whether you already have this field extracted. Because that was your problem before.
Hi @PickleRick,
Yes I have that data.
Basically, the issue right now is that with the subsearch I am getting a different number of results than are actually present, and I am not able to understand why.
Again my 3 cents - do you have the field called request_ids in your data? Because that's what your subsearch will generate a condition for.
And you don't need an explicit format command if you're not overriding any default options for it.
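One way to make the generated condition line up with a field that really exists is sketched below. It assumes the events you want from the outer search carry the ID in the same `request id=` form (that outer rex pattern is a guess, not something confirmed in this thread). The outer search extracts request_id itself, and the subsearch keeps the same field name, so the resulting `request_id="..."` conditions are applied after the extraction rather than against a field called request_ids that the events do not have.

```spl
sourcetype="my_source"
| rex "request id=(?<request_id>[\w-]+)"
| search
    [ search sourcetype="my_source" "failed request, request id="
      | rex "failed request, request id=(?<request_id>[\w-]+)"
      | stats count by request_id
      | fields request_id ]
```

No explicit format is needed here, since the default one is applied at the end of the subsearch. If the ID only appears as free text in the target events, renaming the field to search in the subsearch, as suggested earlier in the thread, is the simpler route.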