Re: How to check if a string is present in a dicti...

anooshac · ‎09-04-2023

I have a CSV file which has a some columns. There is one column named GROUP and the data in that column are in the format {'a':1,'b':2}, there can be any number of strings. There is another column VALUE and the data are in the format {'a','b'}.

I want to check if the strings in VALUE column are present in GROUP column and create a separate column named DATA with the strings not present. I am not sure how to achieve it in Splunk using commands. Can anyone have any suggestions?

Example:

Group Value Data

{'a':'1','b':'2'} {'a','b'}

{'a':1,'b':'2'} {'a'} {'b'}

{'a':1,'b':'2','c':'3'} {'a'} {'b','c'}

There are many columns like these and strings present in GROUP column can be more.

gcusello · ‎09-04-2023

Hi @anooshac,

these seems to be data in json format.

If they aren't in a lookup, you could parse them and store in an index using the INDEXED_EXTRACTIONS=json option of props.conf (https://docs.splunk.com/Documentation/Splunk/9.1.1/Admin/Propsconf).

In this way you extract all the fields and you can use the values for your matches.

Ciao.

Giuseppe

bowesmana · ‎09-04-2023

You can do it like this runnable example with your data - using from the rex statement

| makeresults
| eval _raw="Group                     Value              Data
{'a':'1','b':'2'}            {'a','b'}
{'a':1,'b':'2'}              {'a'}                {'b'}
{'a':1,'b':'2','c':'3'}    {'a'}                {'b','c'}"
| multikv forceheader=1
| table Group Value Data
``` This is your Splunk SPL ```
| rex field=Group max_match=0 "'(?<g>\w)':"
| rex field=Value max_match=0 "'(?<v>\w)'"
| eval Calculated_Data=mvmap(g, if(g!=v, g, null()))
| eval Calculated_Data="{'".mvjoin(Calculated_Data, "','")."'}"
| fields - g v

So, if you have a CSV file with Group and Value in it, then

| inputlookup your_csv.csv
| rex field=Group max_match=0 "'(?<g>\w)':"
| rex field=Value max_match=0 "'(?<v>\w)'"
| eval Data=mvmap(g, if(g!=v, g, null()))
| eval Data="{'".mvjoin(Data, "','")."'}"
| fields - g v

anooshac · ‎09-04-2023

Should the csv file uploaded as lookup file? Can we avoid that?

Also the strings can contain _ as well as they can be capital letters and small letters.. how can i do this? This [A-Za-z_] regular expression is fine?

bowesmana · ‎09-04-2023

Where is your data coming from? Data has to be either

a) In a Splunk index
b) In a lookup in Splunk
c) Part of your search, as in my | makeresults example

anooshac · ‎09-05-2023

Hi @bowesmana the data comes from Splunk index and it is csv file

bowesmana · ‎09-04-2023

Just change the regex in the 2 rex statements

| makeresults
| eval _raw="Group;Value;Data
{'Ala_ABC':'1','Bob_XX_':'2'};{'Ala_ABC','Bob_XX_'};
{'Ala_ABC':1,'Bob_XX_':'2'};{'Ala_ABC'};{'Bob_XX_'}
{'Ala_ABC':1,'Bob_XX_':'2','_c_is_for_Charlie':'3'};{'Ala_ABC'};{'Bob_XX_','_c_is_for_Charlie'}"
| multikv forceheader=1
| table Group Value Data
``` This is your Splunk SPL ```
| rex field=Group max_match=0 "'(?<g>[A-Za-z_]+)':'"
| rex field=Value max_match=0 "'(?<v>[A-Za-z_]+)'"
| eval Calculated_Data=mvmap(g, if(g!=v, g, null()))
| eval Calculated_Data="{'".mvjoin(Calculated_Data, "','")."'}"
| fields - g v

anooshac · ‎09-04-2023

Hi @bowesmana , the method you mentioned is working fine..

Sorry for not mentioning before, the column VALUE can have data like {'a','b'} or a. In such cases how can i change the regex?

bowesmana · ‎09-05-2023

Just add any additional character groupings into the allowed character ranges, i.e.

| rex field=Group max_match=0 "'(?<g>[A-Za-z_\.]+)':'"
| rex field=Value max_match=0 "'(?<v>[A-Za-z_\.]+)'"

How to check if a string is present in a dictionary and print the remaining data

chart

simple XML

Modern way of developing distributed application using OTel

Enterprise Security Content Update (ESCU) | New Releases

Archived Metrics Now Available for APAC and EMEA realms