Best way to merge results? (ver 5.0.5)

yuwtennis
Communicator

Hi!

I would like some advice on how to merge two results.

I have a search as below.

index=A [
search index=A
.....
| fields a b
]

The parent search takes the values of fields a and b returned by the subsearch and searches index=A again.
However, this is a bit slow when the subsearch returns thousands of results.
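
To make the cost concrete, here is a minimal Python sketch of how a subsearch's rows are turned into a filter for the outer search: one AND group per row, with the groups OR'ed together, modelled on the default output of the format command. The function name `expand_subsearch` and the sample rows are made up for illustration; this is a model of the behaviour, not Splunk's own code.

```python
def expand_subsearch(rows, fields):
    """Model how subsearch rows turn into a filter for the outer search.

    Each row becomes an AND group over the kept fields, and the groups
    are OR'ed together, similar to the default output of `format`.
    """
    groups = []
    for row in rows:
        terms = " AND ".join(f'{name}="{row[name]}"' for name in fields)
        groups.append(f"( {terms} )")
    return "( " + " OR ".join(groups) + " )"


# Hypothetical subsearch output: two (a, b) pairs.
rows = [{"a": "1", "b": "x"}, {"a": "2", "b": "y"}]
outer_search = "index=A " + expand_subsearch(rows, ["a", "b"])
print(outer_search)
# index=A ( ( a="1" AND b="x" ) OR ( a="2" AND b="y" ) )
```

With thousands of rows the outer search carries thousands of OR groups, which is probably one reason it slows down as the subsearch result grows.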

As a workaround, I believe you can merge the results in either of the following ways.

  1. Combination of lookup table and inner join
    index=A [
    search index=A
    .....
    | fields a b
    | outputlookup hoge.csv
    | return ""
    ]
    | join type=inner a b [|inputlookup hoge.csv]

  2. Use map
    index=A
    ......
    | fields a b
    | map search="search index=A a=$a$ b=$b$" maxsearches=xxxxxx

Since the cost of the map command depends heavily on the number of rows passed to it, I prefer the combination of join and a lookup table.

What would be the best way to merge the results?

Thanks,
Yu


martin_mueller
SplunkTrust

For filtering a search based on a different search's results, your first approach is usually best.

Let's make up a realistic example: You have events that form a transaction with some transaction_id... somewhere down the line of that transaction there is a user field, and you want to grab the transactions for user=yuwtennis.
A slow search would go like this:

sourcetype=transactions | transaction transaction_id | search user=yuwtennis

That'll build ALL the transactions and then throw out most of them.

Pre-filtering like this doesn't work if the user field isn't present in every event, because events without it are dropped before the transaction is built:

sourcetype=transactions user=yuwtennis | transaction transaction_id

So you'll have to pick out the transaction_id values you need before you build the transaction:

sourcetype=transactions [search sourcetype=transactions user=yuwtennis | dedup transaction_id | fields transaction_id] | transaction transaction_id

That will take a bit more time due to running two searches, but will almost always be miles faster than the first naïve search.
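
The difference between the three search shapes above can be sketched in plain Python. This is only a model under stated assumptions: the event list is invented, `group_by_id` stands in for the transaction command, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical events; only one event per transaction carries a user field.
events = [
    {"transaction_id": "t1", "step": "start"},
    {"transaction_id": "t1", "step": "login", "user": "yuwtennis"},
    {"transaction_id": "t2", "step": "start"},
    {"transaction_id": "t2", "step": "login", "user": "someone_else"},
    {"transaction_id": "t1", "step": "end"},
]


def group_by_id(evts):
    """Group events by transaction_id, standing in for `transaction`."""
    groups = defaultdict(list)
    for event in evts:
        groups[event["transaction_id"]].append(event)
    return dict(groups)


def group_then_filter(evts, user):
    """Slow shape: build every transaction, then discard most of them."""
    groups = group_by_id(evts)
    return {tid: g for tid, g in groups.items()
            if any(e.get("user") == user for e in g)}


def filter_ids_then_group(evts, user):
    """Subsearch shape: find the wanted ids first, then group only those."""
    wanted = {e["transaction_id"] for e in evts if e.get("user") == user}
    return group_by_id(e for e in evts if e["transaction_id"] in wanted)


def prefilter_events(evts, user):
    """Broken shape: filtering events on user drops the other steps."""
    return group_by_id(e for e in evts if e.get("user") == user)


slow = group_then_filter(events, "yuwtennis")
fast = filter_ids_then_group(events, "yuwtennis")
partial = prefilter_events(events, "yuwtennis")

print(slow == fast)        # True: both shapes return the same transactions
print(len(fast["t1"]))     # 3: all events of t1 are kept
print(len(partial["t1"]))  # 1: only the event that carries the user field
```

The second shape reads the events twice but only groups the ones that belong to a wanted transaction_id, which is where the saving comes from.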

Your workaround #1 looks slow because joining will always be very slow compared to filtering before loading events.
Your workaround #2 is probably going to be worse: as you say, the subsearch may return thousands of values, so map would have to run thousands of searches, and that can't be fast.


lguinn2
Legend

I am unclear about why you want to "merge results".

I can't figure out why you can't simply do the search on index=A and be done. More details are needed to figure out the best approach.
