Splunk Search

Splunk query to join two searches

asharmaeqfx
Path Finder

Hi Splunkers,

 

I have a complex query to extract the IDs from first search and join it using that to the second search and then calculate the response times

index=xxxml source=module "matrix-v4" NOT host="xyz.dmz" NOT "somefield1" NOT "somefield2" "<nt3:overall-outcome>*</nt3:overall-outcome>" |where isnotnull(AccuiteCode) |xmlkv | eval MSUserid=AccessCode | eval source1="MS" | join ip [search index=xxxml source="/var/log/production.log" urlPath="/com/system*" "/org/system" NOT "somefield1" NOT "somefield2" | where isnotnull(accessCode) | eval ProdUserID=accessCode| eval source2="Prod"] | where source1="MS" AND source2="Prod" | eval responsetime = Latency/1000 | stats count(responsetime) as Requests avg(responsetime) perc50(responsetime) perc60(responsetime) perc70(responsetime) perc80(responsetime) perc90(responsetime) perc95(responsetime) perc99(responsetime) max(responsetime) by date_mday,date_month,rIdentifier,nt3:overall-outcome,accessCode 

This query only returns the first matched content but we have thousands to rows for the first query. It somehow unable to join it.

Kindly advise.

 

Thanks,

Amit

Labels (3)
0 Karma

MuS
Legend

Hi there,

Take a look at this answer https://community.splunk.com/t5/Splunk-Search/How-to-compare-fields-over-multiple-sourcetypes-withou... it provides examples how it can be done. Basically get a common field over all events and use stats to do the final calculations.

Hope this helps ...

cheers, MuS

0 Karma

asharmaeqfx
Path Finder

Does not helps much. Can you suggest like how to perform join in normal scenario.

0 Karma

thambisetty
SplunkTrust
SplunkTrust

can you try using transforming command (like stats, timechart or chart) before performing join 

make sure you have "ip" field available in subsearch and try to remove duplicates. when join is used as subsearch  in SPL, it has 50000 limitation on results it can return . if you can take only distinct results this would improve search and give you expected results otherwise you will end up seeing partial results.

————————————
If this helps, give a like below.
0 Karma

kennetkline
Path Finder

Would help to see like a single record Json of each source type;  This goes back to the one .conf talk; I have done this a lot us stats as stated. 

Here is how I would go about it;  search verbose to try an get to a single record of source you are looking to join.  I am making some assumption based on your search as posted.

Here is how i do this;  Iterative;  write it one line at a time;  Focus on getting a sample of 2 records to join;  (Hint don't use Join;  It was 2018 before I learned to do this; as I was stuck in SQL Mindset;  It took me a long time to deprogram myself from this bad habit)

1.  Write a single search to show two records to join; 
I am assuming you are not masking your intended search and index, and NOT somefield 1 2 is common across both searches:

2.  where (isnotnull)   I have found just say Field=*   (that removes any null records from the results.  Less results is more;  the more explicit to write the search better  (time, index, sourcetype, host, etc)  just don't wildcard a field=*<something> has to end with star for performance  field=<something>* or just field=* (not null)

(index=xxxml ((source=module "matrix-v4" NOT host="xyz.dmz" "<nt3:overall-outcome>*</nt3:overall-outcome>" AccuiteCode=* ) OR (source="/var/log/production.log" urlPath="/com/system*" "/org/system" accessCode=*)) NOT "somefield1" NOT "somefield2")

3.  you then are defining a sourc1 / 2 variable unique to the source type:  (something like this would work)

| eval sourceA=if(match(source,"module"), "MS", if(match(source,"/var/log/production.log"), "Prod", "Invalid"))

4.  What fields do you need to pass through for calculation:  use stats

I use the following all the time   (vaules(field) as field,  earliest(field) as field, first(field) as field, latest(field) as field, last(field) as field.

Also if you have the same field in different sourcetypes   you can do stuff to match in a stats.  I do this one all the time;  stats count(eval(match(sourcetype, "abc"))) as count_abc, etc.  I think can   stats (values(eval(match(sourcetype,"abc"))) as

I am not sure what all fields you need to pass through I believe the nt3 only is in the first source:

values is needed when only one of the two sources will have the data point

| stats values(nt3:overall-outcome) as nt3:overall-outcome, valuest(Latency) as latency by date_mday,date_month,rIdentifier,accessCode

5> add the response time

| eval responsetime = Latency/1000

6.   final calculation

| stats count(responsetime) as Requests avg(responsetime) perc50(responsetime) perc60(responsetime) perc70(responsetime) perc80(responsetime) perc90(responsetime) perc95(responsetime) perc99(responsetime) max(responsetime) by date_mday,date_month,rIdentifier,nt3:overall-outcome,accessCode


7:  put it all together.

0 Karma
Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer at Splunk .conf24 ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...