How to extract field name and field values from mu...

dm1 · ‎03-09-2022

I am indexing email data that Splunk reads from an inbox folder (via TA-mailclient). Those emails contain a csv file that comes as file attachment to the email.

Below is an example where the field name of the attachment is file_content and the field value is below:

Stopped by Reputation Filtering,Stopped as Invalid Recipients,Spam Detected,Virus Detected,Stopped by Content Filter,Total Threat Messages,Clean Messages,Total Attempted Messages
9.28068485506,0.0,45.1350500141,0.00114191624597,1.53311465023,55.9499914356,44.0500085644,--
251946,0,1225297,31,41620,1518894,1195841,2714735

I want to be able to manipulate the results to look like below:

Stopped by Reputation Filtering	Stopped as Invalid Recipients	Spam Detected	Virus Detected	Stopped by Content Filter	Total Threat Messages	Clean Messages	Total Attempted Messages
9.280684855	0	45.13505001	0.001141916	1.53311465	55.94999144	44.05000856	--
251946	0	1225297	31	41620	1518894	1195841	2714735

Can someone please advise how to achieve this ?

yuanliu · ‎03-09-2022

I could swear that I had seen the same question recently but cannot find it. I also do not know an elegant solution. (The ask is really to extract fields from a field of CSV.) So, here we go:

| eval file_content = split(file_content, "
") ``` chop field into lines ```
| eval header = mvindex(file_content, 0)
| eval file_content = mvindex(file_content, 1, -1) ``` get values ```
| mvexpand file_content ``` one value per line ```
| eval _raw = mvzip(split(header, ","), split(file_content, ","), "=") ``` pair headers with values ```
| kv pairdelim="
" kvdelim="="
| fields - _raw file_content header

The sample data will give

Clean Messages	Detected	Filter	Filtering	Messages	Recipients	Spam_Detected	Stopped as Invalid_Recipients	Stopped_by_Content_Filter	Stopped_by_Reputation_Filtering	Total_Attempted_Messages	Total_Threat_Messages	Virus_Detected
44.0500085644	45.1350500141	1.53311465023	9.28068485506	55.9499914356	0.0	45.1350500141	0.0	1.53311465023	9.28068485506	--	55.9499914356	0.00114191624597
1195841	1225297	41620	251946	1518894	0	1225297	0	41620	251946	2714735	1518894	31

dm1 · ‎03-09-2022

@yuanliu Thanks for your prompt reply.

I am trying to understand your query.

When I fully ran your query, the file_content field and values entirely disappeared. Results were just empty lines.

Then I tried to execute query line by line.

| eval file_content = split(file_content, "") ``` chop field into lines ```

1st attempt with the above returned file_content values with each character as single value like,

s

t

o

p

p

e

d

After I added a "," between double quotes | eval file_content = split(file_content, ",") ``` chop field lines ```, it returned each cell value as single value

like

Stopped by Reputation Filtering
Stopped as Invalid Recipients
etc

still figuring out...

yuanliu · ‎03-10-2022

@dm1 As ITWhisperer explained, the split character is a literal newline, which you can enter in search window by holding down Ctrl (or Mac 'control') key while pressing "Enter" (or Mac 'return'). Similarly, pairdelim is also a literal newline.

The last line removes file_content just to show results more clearly. It is not part of the solution.

ITWhisperer · ‎03-09-2022

As @yuanliu showed, the split has to be by new line, you have removed this in your search

| eval file_content = split(file_content, "
") ``` chop field into lines ```

This assume that file_content is a single value with new lines delimiting the rows of the csv, not a multi-value field. If it is a multi-value field already, then the split is unnecessary.

How to extract field name and field values from multi-value field?

eval

field extraction

other

regex

rex

Archived Metrics Now Available for APAC and EMEA realms

Detecting Remote Code Executions With the Splunk Threat Research Team

Enter the Dashboard Challenge and Watch the .conf24 Global Broadcast!