Splunk Search

Join not working

av_
Path Finder

I'm trying to use an outer join but I am not getting the desired output. Looks like the query in the left has less events than the sub search query.  Could that be the reason for outer join not working.
I can't use STATS because both the queries have multiple indexes & sourcetypes. 


Labels (1)
0 Karma

av_
Path Finder

@richgalloway @ITWhisperer Here's the query:

(index=app* (app=Application source="abc" "eventName=what is your name" *className IN (first*,second*,third*,fouth*))
OR (app=Application1 sourcetype="music:pqr" source="music/pqr.log" "Random raw msf" "status=COMPLETED" *className IN (first*,second*,third*,fouth*))
OR (source="xyz/eventmanagement/eventmanagement.log" "messages from _raw" name=my_amazon_order OR name=my_shiprocket_order *className IN (first*,second*,third*,fouth*)))
OR (app=Application2 "raw message" source="aaa/orderdetailsave/orderdetailsave.log" **className IN (first*,second*,third*,fouth*))
earliest=$time.earliest$ latest=$time.latest$
| dedup field1
| eval component="FirstComp"
| join field1 type=outer
[ search index=index1 index1=main sourcetype="log4j:*" source="/var/log/*/random.log" host="host1*"

| dedup field1
| eval component= "secondcomp"
| eval field2=field1
]
| where isnull(field2)
| table field1


The problem statement is that Fist component has 1000 events whereas second component has 2000 events. While using inner join, both the components have 1000 common events. While using left join, the result should be 0 but I'm getting those 1000 events which are visible while using inner join.

Also, query structure needs to be the same due to some prior JS changes. 

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Using left join, you should get 1000 events from the first part of the search (left and outer mean the same thing). The where command would strip out events which didn't match, but you already said that the 1000 from the first / left side of the join match with 1000 from the second / right side of the join., so I would not expect it to remove any events.

0 Karma

av_
Path Finder

@ITWhisperernot expect it to remove any events mean that I should be able to see 1000 results after loading the query or (0 results) no results because the events are matching perfectly. 

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

So, what are you saying is not working, are you getting no results when you were expect some or results when you were expecting none?

Either way, it sounds like a problem with your data not being as you expected. Are the fields extracted as you expected? Do you have any "hidden" blank spaces which cause the join to give unexpected results?

You could try including the "inner" search in the main search and set a field based on whether the event can be identified as coming from the inner search or not e.g.

| eval inner=if(index="index1", "true", "false")
| stats values(inner) as innerouter by field1

If innerouter has both "true" and "false" then value in field1 appeared in the inner search and at least one of the outer searches, etc.

0 Karma

av_
Path Finder

I’m expecting 0 results because the 1000 events in the left query are a subset of the 2000 events in the right query. 
But in real I’m getting 1000 events on using left join which seems incorrect. 

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

As I suggested, it might be your data because the way you appear to be doing it should work.

Can you identify values of field1 which should have joined which don't appear to have joined?

Also, bear in mind that sub-search (as used by your inner search on the join) are limited to 50,000 events so it could be that the missing inner events have fallen outside the 50k limit. Try reducing the timeframe for you search to see if there is a point at which you get the results you expect.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

If the left side is a subset of the right side then the left side will be the result of a Left Join.

---
If this reply helps you, Karma would be appreciated.

ITWhisperer
SplunkTrust
SplunkTrust

Just because events have multiple indexes and source types does not mean you can't use stats to correlate events in the events pipeline. In addition to @richgalloway's request, please also share some sample representative anonymised events showing how you would like these events to correlated.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Please share the query.

Please tell us what "not working" means.  What results do you get and how do those results not meet expectations?

---
If this reply helps you, Karma would be appreciated.
0 Karma
Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer at Splunk .conf24 ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...