Splunk Search

Field Extraction (Regex) When Column Is Sometimes Absent

RMartinezDTV
Path Finder

Hi, I'm working on a regex for field extraction from an alert log. The log has one line per alert, in the following format:

[11/26/2013 9:13:41 AM]     Server1 LogTest: /var/log   Ok      Text Log test
[11/26/2013 9:13:36 AM]     Server1 LogTest: /var/log   Bad <......data.......> Text Log test

The difficulty comes when handling some 'Ok' statuses: a 'Bad' status returns data (the relevant log lines), but an 'Ok' status returns a blank data section (actually just 2 tabs).

Every regex I come up with seems to capture part or all of "Text Log test" as the data section when data isn't present.

Can I get some pointers on the proper regular expression? My current regex is below, and I think I've exhausted the guess-and-check method. 🙂

]\t+\s+(?P<server>.+?)\s+(?P<category>.+?)\s(?P<object>.+?)\t(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)
1 Solution

RMartinezDTV
Path Finder

Probably tacky to accept my own answer, but here's the final result for reference:

]\t+\s+(?P<server>.+?)\s+(?P<category>.+?):\s(?P<object>.+?)\t(?P<status>.+?)\t(?P<data>.*)\t(?P<test>.*)\t

This correctly matches events even when a field is blank. Adjust the punctuation (\t, \s, :, and ]) as needed for your data.
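For anyone who wants to check the pattern outside Splunk, here is a minimal Python sketch that applies it to the two sample lines. The exact whitespace is an assumption: the lines below use two tabs after the timestamp, tabs between the later fields, and one trailing tab (the final \t in the pattern requires it). Python's re module is not Splunk's regex engine, but it accepts the same (?P<name>...) syntax for this pattern.

```python
import re

# The accepted pattern, copied verbatim from the answer above.
PATTERN = re.compile(
    r"]\t+\s+(?P<server>.+?)\s+(?P<category>.+?):\s(?P<object>.+?)"
    r"\t(?P<status>.+?)\t(?P<data>.*)\t(?P<test>.*)\t"
)

# Assumed layout (not confirmed by the thread): two tabs after the timestamp,
# tab-separated fields, and a single trailing tab.
OK_LINE = "[11/26/2013 9:13:41 AM]\t\tServer1 LogTest: /var/log\tOk\t\tText Log test\t"
BAD_LINE = (
    "[11/26/2013 9:13:36 AM]\t\tServer1 LogTest: /var/log"
    "\tBad\t<......data.......>\tText Log test\t"
)


def extract(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for line in (OK_LINE, BAD_LINE):
        print(extract(line))
```

Under those assumptions, the 'Ok' line yields an empty data field, and test holds "Text Log test" for both lines.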

RMartinezDTV
Path Finder

Ayn, I actually read your notes here: http://answers.splunk.com/answers/67170/index-time-field-extraction about using search-time extractions... and I just learned what the difference is from the docs!

kallu
Communicator

Would it work better if you changed the end from

(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)

to

(?P<status>.+)\t(?P<data>.*)\t(?P<test>.+)

so that data matches an empty string when there are just 2 tabs, as in the "Ok" case? It sounds too easy, and I didn't test it with Splunk, so maybe I'm missing something.
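A quick way to see the difference between the two endings is to run both against just the tail of each sample line. This is a Python sketch, not a Splunk test. The tab-separated tails with no trailing tab are an assumption about the real data, and the helper name fields is made up for illustration.

```python
import re

# Tail of the original pattern: ".+?" needs at least one character per field.
LAZY_TAIL = re.compile(r"(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)")
# Tail with the suggested change: ".*" lets the data field be empty.
SUGGESTED_TAIL = re.compile(r"(?P<status>.+)\t(?P<data>.*)\t(?P<test>.+)")

# Assumed tab-separated tails of the two sample lines, with no trailing tab.
OK_TAIL = "Ok\t\tText Log test"
BAD_TAIL = "Bad\t<......data.......>\tText Log test"


def fields(pattern, text):
    """Hypothetical helper: named groups as a dict, or None when nothing matches."""
    match = pattern.search(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for tail in (OK_TAIL, BAD_TAIL):
        print("original: ", fields(LAZY_TAIL, tail))
        print("suggested:", fields(SUGGESTED_TAIL, tail))
```

With these inputs, the original ending cannot match the 'Ok' tail at all, because the two tabs are adjacent and (?P<data>.+?) needs at least one character between them. The suggested ending matches with an empty data field. On the 'Bad' tail, the original ending's final lazy .+? stops after the first character of the test field, since nothing follows it in the pattern.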

RMartinezDTV
Path Finder

Thanks! This was almost perfect. See my answer below.

lukejadamec
Super Champion

My mistake, I meant a search-time field extraction.

Ayn
Legend

DELIMS doesn't work as an index-time extraction, and index-time extractions should be avoided unless you really know what you're doing and why.

lukejadamec
Super Champion

Have you tried setting this up as a search-time extraction using the log delimiter and a preset series of fields?
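To illustrate the idea of splitting on a delimiter and assigning a preset list of field names, here is a rough Python analogue. It is not Splunk configuration and makes no claim about how Splunk handles consecutive delimiters. The field names, the helper split_fields, and the tab layout of the sample line are all assumptions made for the sketch.

```python
# Hypothetical preset field names for the tab-separated part of the line.
FIELD_NAMES = ["target", "status", "data", "test"]

# Assumed layout: two tabs after the timestamp, then tab-separated fields.
OK_LINE = "[11/26/2013 9:13:41 AM]\t\tServer1 LogTest: /var/log\tOk\t\tText Log test"


def split_fields(line):
    """Split the part after the timestamp on tabs and pair it with FIELD_NAMES."""
    # Drop the bracketed timestamp, then the padding that follows it.
    _, _, rest = line.partition("]")
    values = rest.lstrip().split("\t")
    # An empty value between two tabs is kept, so a blank data field stays blank.
    return dict(zip(FIELD_NAMES, values))


if __name__ == "__main__":
    print(split_fields(OK_LINE))
```

Because the split keeps empty strings between adjacent delimiters, the blank data section of an 'Ok' line stays in its own slot instead of shifting the later fields.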
