Solved: Re: Not getting results from JOIN

Nicholas_Key · ‎08-18-2010

Hi all,

There are two datacubes that I want to perform a join operation.

The first search string looks like this:

index="vmware" sourcetype="esxtop_Group_Cpu" pct_CoStop | rex field=_raw "id=[\d]+:(?P<VM>.*)" | fields VM 

Wed Aug 18 11:33:55 PDT 2010 pct_CoStop=0.00, id=22:LisaSplunk4VMware
Wed Aug 18 11:33:55 PDT 2010 pct_CoStop=0.01, id=21:vCenter

The second search string looks like this:

index="vmware" sourcetype="VM_Inventory" | head 1 | multikv fields DATACENTER CLUSTER VMHOST VM | table DATACENTER CLUSTER VMHOST VM

DATACENTER  CLUSTER     VMHOST  VM
SF  Intel-Hosts 10.1.6.34   perfVMFS
SF  Intel-Hosts 10.1.6.34   NicholasVMTest
SF  Intel-Hosts 10.1.6.34   Win2003_x86_template
SF  Intel-Hosts 10.1.6.34   LisaSplunk4VMware
SF  Intel-Hosts 10.1.6.34   Support_vm_debian
SF  Intel-Hosts 10.1.6.34   JMW Ubuntu
SF  Intel-Hosts 10.1.6.34   vCenter
SF  Intel-Hosts 10.1.6.23   perfRaw
SF  AMD-Hosts   10.1.12.5   Windows_2k3_64bit
SF  AMD-Hosts   10.1.12.5   SUDAENGW2008
SF  AMD-Hosts   10.1.12.5   Windows_XP_JPN
SF  AMD-Hosts   10.1.12.5   Windows_XP
SF  AMD-Hosts   10.1.12.5   Windows_XP_dev
SF  AMD-Hosts   10.1.12.5   Windows_2K_i386
SF  AMD-Hosts   10.1.12.5   Splunk4VMWare
SF  AMD-Hosts   10.1.12.4   OpenSuse_10_x86_64
SF  AMD-Hosts   10.1.12.4   CentOS_3.9_i386
SF  AMD-Hosts   10.1.12.4   OpenSuse_10_i386
SF  AMD-Hosts   10.1.12.4   Windows_Vista_64bit
SF  AMD-Hosts   10.1.12.4   Solaris10_x86_64
SF  AMD-Hosts   10.1.12.4   CentOS_5.3_x84_64
SF  AMD-Hosts   10.1.12.4   LiveCD2
SF  AMD-Hosts   10.1.12.4   CentOS_3.9_x86_64
SF  AMD-Hosts   10.1.12.4   CentOS_5.1_i386
SF  AMD-Hosts   10.1.12.4   Ubuntu_8.0.4_x86_64
SF  AMD-Hosts   10.1.12.4   Windows_2k8_32bit
SF  AMD-Hosts   10.1.12.4   FreeBSD_6.4_x86_64
SF  AMD-Hosts   10.1.12.4   LiveCD1
SF  AMD-Hosts   10.1.12.4   Windows_2K8_64bit_JPN
SF  AMD-Hosts   10.1.12.4   VMware Infrastructure Management Assistant
SF  AMD-Hosts   10.1.12.4   CentOS_4.6_x86_64
SF  AMD-Hosts   10.1.12.4   CentOS_5.1_x84_64
SF  AMD-Hosts   10.1.12.4   CentOS_4.6_i386
SF  AMD-Hosts   10.1.12.4   Ubuntu_8.0.4_i386
SF  AMD-Hosts   10.1.12.4   Windows_2k3_32bit
SF  AMD-Hosts   10.1.12.4   LiveCD3

However, I'm not getting any results from the join operation that I use in this search string.

index="vmware" sourcetype="esxtop_Group_Cpu" pct_CoStop | rex field=_raw "id=[\d]+:(?P<VM>.*)" | fields VM pct_CoStop | join VM [search index="vmware" sourcetype="VM_Inventory" | head 1 | multikv fields DATACENTER CLUSTER VMHOST VM | table DATACENTER CLUSTER VMHOST VM]

Any thoughts? Is multikv working at all in this scenario?

gkanapathy · ‎08-18-2010

So, first of all, if this is your own output from the VM_Inventory script (I'm guessing it is because it looks like a reworking of this answer.) I would recommend this not be the format you use. Write each item out as a separate event instead of a giant table that you have to separate out with multikv. multikv is convenient only because so many unix commands provide output that's inconvenient to Splunk, but you should avoid creating stuff like that. Why? Well, for example, if I wanted to report on one specific VM by name, it would be a pain. I can't use automatic lookups against this data effectively. It does have the advantage I suppose that you can efficiently get the latest version of the full list with "head 1" though.

Alternatively, you might consider pushing this data into a lookup table, depending how often it updates and what you're trying to do with it. It seems to me that at least some of your use case would be well-served by taking the inventory data and periodically writing a lookup table (either directly with your script, or via a scheduled job that took the indexed inventory data and did outputlookup on it.)

Anyway, to get to your immediate question, I don't know. If the fields actually come out of the individual queries correctly, then you should be getting two results out of your join. And you're getting zero results.

View solution in original post

gkanapathy · ‎08-18-2010

So, first of all, if this is your own output from the VM_Inventory script (I'm guessing it is because it looks like a reworking of this answer.) I would recommend this not be the format you use. Write each item out as a separate event instead of a giant table that you have to separate out with multikv. multikv is convenient only because so many unix commands provide output that's inconvenient to Splunk, but you should avoid creating stuff like that. Why? Well, for example, if I wanted to report on one specific VM by name, it would be a pain. I can't use automatic lookups against this data effectively. It does have the advantage I suppose that you can efficiently get the latest version of the full list with "head 1" though.

Alternatively, you might consider pushing this data into a lookup table, depending how often it updates and what you're trying to do with it. It seems to me that at least some of your use case would be well-served by taking the inventory data and periodically writing a lookup table (either directly with your script, or via a scheduled job that took the indexed inventory data and did outputlookup on it.)

Anyway, to get to your immediate question, I don't know. If the fields actually come out of the individual queries correctly, then you should be getting two results out of your join. And you're getting zero results.

Nicholas_Key · ‎08-18-2010

It appears that the raw values of CLUSTER, VMHOST and VM have a leading whitespace character. The solution is to add a trim()

Not getting results from JOIN

Detecting Remote Code Executions With the Splunk Threat Research Team

Observability | Use Synthetic Monitoring for Website Metadata Verification

More Ways To Control Your Costs With Archived Metrics | Register for Tech Talk