Splunk Search

Field Extraction (Regex) When Column Is Sometimes Absent

RMartinezDTV
Path Finder

Hi, I'm working on a regex for field extraction from an alert log. The log has one line per alert, in the following format:

[11/26/2013 9:13:41 AM]     Server1 LogTest: /var/log   Ok      Text Log test
[11/26/2013 9:13:36 AM]     Server1 LogTest: /var/log   Bad <......data.......> Text Log test

The difficulty comes from the 'Ok' statuses: a 'Bad' status returns data (the relevant log lines), but an 'Ok' status returns a blank data section (actually just two consecutive tabs).

Every regex I come up with ends up capturing part of "Text Log test" as the data field when the data section is empty.

Can I get some pointers on the proper regular expression? My current regex is below, and I think I've exhausted the guess-and-check method. 🙂

]\t+\s+(?P<server>.+?)\s+(?P<category>.+?)\s(?P<object>.+?)\t(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)

RMartinezDTV
Path Finder

Probably tacky to accept my own answer, but here's the final result for reference:

]\t+\s+(?P<server>.+?)\s+(?P<category>.+?):\s(?P<object>.+?)\t(?P<status>.+?)\t(?P<data>.*)\t(?P<test>.*)\t

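This correctly matches events even when the data field is blank: .* can match an empty string between two tabs, whereas .+? requires at least one character. Adjust the delimiters (\t, \s, :, and ]) as needed for your data.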
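
For anyone who wants to try the pattern outside Splunk, here is a minimal Python sketch (not part of the original post). Python's re module accepts the same (?P<name>...) named-group syntax. The sample lines are assumptions reconstructed from the format in the question: two tabs after the timestamp, one tab between each remaining field, and a trailing tab. Adjust them to match the real log.

```python
import re

# Final pattern from the post above, split across two string literals
# only for readability.
PATTERN = re.compile(
    r"]\t+\s+(?P<server>.+?)\s+(?P<category>.+?):\s(?P<object>.+?)"
    r"\t(?P<status>.+?)\t(?P<data>.*)\t(?P<test>.*)\t"
)

# Assumed tab layout (hypothetical; the real log may differ).
OK_LINE = (
    "[11/26/2013 9:13:41 AM]\t\tServer1 LogTest: /var/log"
    "\tOk\t\tText Log test\t"
)
BAD_LINE = (
    "[11/26/2013 9:13:36 AM]\t\tServer1 LogTest: /var/log"
    "\tBad\t<......data.......>\tText Log test\t"
)


def extract(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


ok_fields = extract(OK_LINE)
bad_fields = extract(BAD_LINE)
print(ok_fields)
print(bad_fields)
```

Under these assumptions the 'Ok' line yields an empty string for data, and test is the same text in both cases.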

RMartinezDTV
Path Finder

Ayn, I actually read your notes here: http://answers.splunk.com/answers/67170/index-time-field-extraction about using search-time extractions, and I just learned what the difference is from the docs!

kallu
Communicator

Would it work better if you changed the end from

(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)

to

(?P<status>.+)\t(?P<data>.*)\t(?P<test>.+)

That way data should match an empty string when there are just two tabs, as in the "Ok" case. It sounds too easy and I didn't test it with Splunk, so maybe I'm missing something?
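
One way to check this without Splunk is to run both tail patterns against just the end of an event. The following is a minimal Python sketch, assuming the tail is status, tab, data, tab, test text; the exact whitespace in the real log is an assumption, and the helper name fields is made up for illustration.

```python
import re

# Tail of an "Ok" event: the data field between the two tabs is empty.
OK_TAIL = "Ok\t\tText Log test"
# Tail of a "Bad" event: the data field is populated.
BAD_TAIL = "Bad\t<......data.......>\tText Log test"

ORIGINAL = re.compile(r"(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)")
SUGGESTED = re.compile(r"(?P<status>.+)\t(?P<data>.*)\t(?P<test>.+)")


def fields(pattern, text):
    """Return the named groups as a dict, or None when nothing matches."""
    match = pattern.search(text)
    return match.groupdict() if match else None


original_ok = fields(ORIGINAL, OK_TAIL)
suggested_ok = fields(SUGGESTED, OK_TAIL)
suggested_bad = fields(SUGGESTED, BAD_TAIL)
print(original_ok)
print(suggested_ok)
print(suggested_bad)
```

On this fragment the original tail finds no match at all, because .+? needs at least one character for data and there is no third tab to stop at. The suggested tail matches with an empty data field.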

RMartinezDTV
Path Finder

Thanks! This was almost perfect. See my answer below.

lukejadamec
Super Champion

My mistake, I meant a search-time field extraction.

Ayn
Legend

DELIMS doesn't work as an index-time extraction, and index-time extractions should be avoided unless you really know what you're doing and why.

lukejadamec
Super Champion

Have you tried setting this up as a search-time extraction using the log delimiter and a preset series of fields?
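
The idea behind a delimiter-based extraction with a preset field list can be illustrated outside Splunk in a few lines. This is a Python sketch of the concept only, not Splunk configuration; the field names, the helper split_fields, and the tab layout of the sample strings are all assumptions made for illustration.

```python
# Preset field names, applied in order to the tab-separated pieces.
# (Hypothetical names chosen for illustration.)
FIELD_NAMES = ["object", "status", "data", "test"]


def split_fields(tail, delimiter="\t"):
    """Split on the delimiter and pair each piece with its preset name."""
    return dict(zip(FIELD_NAMES, tail.split(delimiter)))


ok = split_fields("/var/log\tOk\t\tText Log test")
bad = split_fields("/var/log\tBad\t<......data.......>\tText Log test")
print(ok)
print(bad)
```

Because the split happens on every delimiter, two consecutive tabs produce an empty data value rather than shifting the later fields.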
