Field Extraction - Separate on Colon?

holtb
Explorer

I'm trying to extract -all- the fields from a rather complex Oracle Grid Engine log file with a format like this:

all.q:50s01fb:clusterusers:zarakhov:transc4desc.fasta_$SGE_TASK_ID.qsub:5161781:sge:0:1344452577:1344452604:1344452617:0:0:13:8.781664:0.452931:0.000000:0:0:0:0:332711:716:0:0.000000:0:0:0:0:12147:13606:normal:MPRICompGenomics:NONE:1:987:9.234595:1.061786:0.000000:-u zarakhov -q all.q -l h_rt=21600,h_vmem=2G,mem_free=2G,mem_reserve=2G,virtual_free=2G:0.000000:NONE:383160320.000000:0:0

Obviously, I'd like to separate on the colons and dump each of the values into a sequentially labeled field (fieldname1-fieldname27 or whatnot). I've found fairly easy ways to extract the fields one at a time. Is there a mechanism for doing it in a more efficient way, like the split command in Perl?

dmaislin_splunk
Splunk Employee

I think I get what you are trying to do, and you need to break it into phases. Phase A splits the event on the colon and gives you all 45 fields (f1 through f45). Phase B is a bit weird: it breaks field f40 into two, using the -l flag as the delimiter, so f46 holds everything before -l and f47 holds everything after it. Phase C breaks f46 apart with a REGEX by flag type, capturing the -u value as f48 and the -q value as f49. Finally, Phase D uses nested delims to break the f47 field into the final name/value pairs (h_rt, h_vmem, mem_free, and so on). To try this, just run a search:

sourcetype="yoursourcetype" | table f* h_* mem* vir*

props.conf

[yoursourcetype]
REPORT-extractions = a,b,c,d

transforms.conf

[a]
DELIMS = ":"
FIELDS = f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15, f16, f17, f18, f19, f20, f21, f22, f23, f24, f25, f26, f27, f28, f29, f30, f31, f32, f33, f34, f35, f36, f37, f38, f39, f40, f41, f42, f43, f44, f45

[b]
SOURCE_KEY = f40
REGEX = (.*)-l(.*)
FORMAT = f46::$1 f47::$2

[c]
SOURCE_KEY = f46
REGEX = -u (.+?) -q (.+?)$
FORMAT = f48::$1 f49::$2

[d]
SOURCE_KEY = f47
DELIMS = ",","="
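
If it helps to see what the four phases do to the sample line, here is a rough Python sketch of the same logic. It only mimics the transforms above and is not how Splunk implements DELIMS or REGEX internally. The function name extract and the whitespace trimming are assumptions made for illustration.

```python
import re

# Sample accounting line from the question, split across literals for width.
line = (
    "all.q:50s01fb:clusterusers:zarakhov:transc4desc.fasta_$SGE_TASK_ID.qsub:"
    "5161781:sge:0:1344452577:1344452604:1344452617:0:0:13:8.781664:0.452931:"
    "0.000000:0:0:0:0:332711:716:0:0.000000:0:0:0:0:12147:13606:normal:"
    "MPRICompGenomics:NONE:1:987:9.234595:1.061786:0.000000:"
    "-u zarakhov -q all.q -l h_rt=21600,h_vmem=2G,mem_free=2G,"
    "mem_reserve=2G,virtual_free=2G:0.000000:NONE:383160320.000000:0:0"
)


def extract(event):
    """Mimic phases A-D on one event and return a dict of field -> value."""
    # Phase A: split on ":" and label the pieces f1, f2, ... in order.
    fields = {f"f{i}": value for i, value in enumerate(event.split(":"), start=1)}

    # Phase B: cut f40 at the "-l" flag (greedy, so the last "-l" wins).
    m = re.search(r"(.*)-l(.*)", fields.get("f40", ""))
    if m:
        fields["f46"], fields["f47"] = m.group(1), m.group(2)

    # Phase C: pull the -u and -q values out of f46.
    # Trimming whitespace is a choice made in this sketch only.
    m = re.search(r"-u (.+?) -q (.+?)$", fields.get("f46", ""))
    if m:
        fields["f48"], fields["f49"] = m.group(1).strip(), m.group(2).strip()

    # Phase D: split f47 on "," into pairs, then each pair on "=".
    for pair in fields.get("f47", "").split(","):
        key, sep, value = pair.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


fields = extract(line)
print(fields["f1"], fields["f48"], fields["f49"], fields["h_rt"])
```

Phases B through D only read from f40, so leaving them out keeps that field as one intact string while f1 through f45 stay individually searchable.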

dmaislin_splunk
Splunk Employee

Does this answer your question? If so, please accept.

yannK
Splunk Employee

Take a look at the extract command in the search language:
... | extract pairdelim=",", kvdelim="="
http://docs.splunk.com/Documentation/Splunk/4.3.3/SearchReference/Extract

The equivalent is available in the configuration files and in Manager if you want the extraction applied automatically.

holtb
Explorer

That's pretty neat, but I don't care about the key=value pairs in the middle, I'd rather keep that whole section intact. I just want to separate the many fields by : so I can search against them separately.
