How can I configure Splunk to extract some fields from the source filename?

I already specify a host_regex and that works great. I also understand that if there is a date in the filename, Splunk will find it automatically. The fields can be extracted at index time if necessary.

I have Splunk watch a lot of files and directories. For some source types, there are fields in the filename that aren't the 'host' or a 'date' field. Furthermore, these fields aren't repeated in the event data itself (i.e. they appear only in the filename, not in the file content).

Here's an example from a host collecting Oracle alert logs:

<logdir>/<host>.<sid>.log

/tmp/splunk_alert_logs/db01.TOOL.log

This might have been hit already, but I'm having some difficulty finding an answer that doesn't involve an automatically located field.

asked 24 Aug '10, 08:07

jstillwell

edited 24 Aug '10, 16:47

gkanapathy ♦

Thanks, I know to avoid indexing fields. I just knew that 'host' was indexed, so I wasn't sure how fields from the filename were going to work out. I understand now, though. Thanks, both of you.

(24 Aug '10, 19:56) jstillwell

2 Answers:

Hi, as a rule of thumb, it is bad to have Splunk index new fields unless it is really necessary (it puts a higher burden on the indexer, and so on). What you most likely need is a search-time field extraction, which you can configure like this:

Suppose your oracle alert logs have the sourcetype "oracle_alert", then in local/props.conf:

[oracle_alert]
EXTRACT-sourcefields = (?<logdir>[\w\W/]+)/(?<host_2>[^\.]+)\.(?<sid>[^\.]+)\.log in source
# (double check the regex) (edit: the "in source" is what tells splunk to look into the source field)

That would instruct Splunk to extract three fields: logdir (anything before the last /), host_2 (which I renamed so it does not override the original "host" field), and sid.

You don't need to modify fields.conf for this.

Another method would be to use transforms.conf as well.

For further info on the alternative methods, you can write a comment here or refer to the props.conf documentation and search for the keyword "EXTRACT".

If you want to test the regex before applying the configuration, you can use the rex command in the search bar. In this case, you could run a search like:

sourcetype=oracle_alert | rex field=source max_match=10 "(?<logdir>[\w\W/]+)/(?<host_2>[^\.]+)\.(?<sid>[^\.]+)\.log"

and check that the three fields appear on the left field-picker menu.
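If you would rather sanity-check the pattern outside Splunk first, here is a minimal Python sketch that runs the same regex against the example path from the question. It only assumes Python's standard `re` module; the helper name `extract_source_fields` is made up for illustration. Note that Python requires the `(?P<name>...)` spelling for named groups, while Splunk's PCRE-based syntax accepts `(?<name>...)` as written above.

```python
import re

# Same pattern as the EXTRACT above, with named groups respelled as
# (?P<name>...) because Python's re module requires that form.
SOURCE_RE = re.compile(
    r"(?P<logdir>[\w\W/]+)/(?P<host_2>[^\.]+)\.(?P<sid>[^\.]+)\.log"
)


def extract_source_fields(source):
    """Return the named groups as a dict, or None if the path does not match.

    Hypothetical helper, not part of Splunk.
    """
    match = SOURCE_RE.search(source)
    return match.groupdict() if match else None


# Example path taken from the question.
fields = extract_source_fields("/tmp/splunk_alert_logs/db01.TOOL.log")
print(fields)
# {'logdir': '/tmp/splunk_alert_logs', 'host_2': 'db01', 'sid': 'TOOL'}
```

This is only a quick check of the regex logic; the rex search above is still the way to confirm the behaviour inside Splunk itself.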

Hope that helped a bit, Paolo

answered 24 Aug '10, 16:15

Paolo Prigione

You should be able to just define a transform in transforms.conf with SOURCE_KEY set to "source" and a REGEX that defines your field name. Something like:

[a_transform]  
SOURCE_KEY = source  
REGEX = (?i)[\/A-Za-z]+\/(?<give_it_a_fieldname>\w+)(?=\.\w+)  

In your props.conf, you reference "a_transform" like this:

[a_sourcetype]  
REPORT-transform = a_transform  

You'll probably also have to define the field name in fields.conf, since the field value would not have been indexed; for example:

[give_it_a_fieldname]  
INDEXED_VALUE = false
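
To see what the illustrative REGEX above actually captures, here is a small Python sketch that applies it to the example path from the question. It assumes only the standard `re` module; `apply_transform` is a hypothetical helper name, and the named group is respelled as `(?P<name>...)` because Python requires that form.

```python
import re

# The REGEX from the [a_transform] stanza, with the named group written
# in Python's (?P<name>...) form.
TRANSFORM_RE = re.compile(
    r"(?i)[\/A-Za-z]+\/(?P<give_it_a_fieldname>\w+)(?=\.\w+)"
)


def apply_transform(source):
    """Return the captured value, or None if nothing matches.

    Hypothetical helper that mimics an unanchored regex search.
    """
    match = TRANSFORM_RE.search(source)
    return match.group("give_it_a_fieldname") if match else None


# Example path taken from the question.
value = apply_transform("/tmp/splunk_alert_logs/db01.TOOL.log")
print(value)  # db01
```

On this path the pattern picks up the first dot-delimited token of the filename (db01, the host part), so the REGEX would need adjusting if the goal is to capture the sid instead.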
answered 24 Aug '10, 11:15

ayme

edited 24 Aug '10, 16:46

gkanapathy ♦

Note that if you search for this field alone, because it's marked as a non-indexed value, Splunk will perform a full table scan to find matches. To get around this performance issue, you could extract the field at index time, set up a lookup table that maps all sources to your fields, or set up a set of eventtypes.

(24 Aug '10, 17:03) Stephen Sorkin ♦
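
For the lookup-table option mentioned in the comment above, one possible way to generate the table is sketched below in Python. This is an assumption-laden illustration, not a Splunk feature: the column names, the helper `build_lookup_csv`, and the second source path are all hypothetical, and the resulting CSV would still have to be installed and configured as a lookup in Splunk separately.

```python
import csv
import io
import re

# Matches "<host>.<sid>.log" at the end of a path; simplified from the
# patterns discussed in the answers (an assumption, adjust as needed).
SOURCE_RE = re.compile(r"/(?P<host_2>[^/.]+)\.(?P<sid>[^/.]+)\.log$")


def build_lookup_csv(sources):
    """Return CSV text with one row per source path that matches.

    Hypothetical helper; paths that do not match are skipped.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "host_2", "sid"])  # header row
    for source in sources:
        match = SOURCE_RE.search(source)
        if match:
            writer.writerow([source, match.group("host_2"), match.group("sid")])
    return buffer.getvalue()


example_sources = [
    "/tmp/splunk_alert_logs/db01.TOOL.log",  # from the question
    "/tmp/splunk_alert_logs/db02.PROD.log",  # hypothetical second source
]
lookup_csv = build_lookup_csv(example_sources)
print(lookup_csv)
```

The trade-off is that the table has to be regenerated whenever new source files appear, which the search-time extraction does not require.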
Copyright © 2005-2012 Splunk, Inc. All rights reserved.