|
How can I configure Splunk to extract some fields from the source filename? I already specify a host_regex, and that works great. I also understand that if there is a date in the filename, Splunk will find it automatically. The field can be extracted at index time if it must be. I have Splunk watching a lot of files and directories. For some source types, there are fields in the filename that are neither the 'host' nor a 'date' field. Furthermore, these fields aren't repeated in the event data itself (i.e. they appear only in the filename, not in the file content). Here's an example from a host collecting Oracle alert logs.
This might have been asked already, but I'm having some difficulty finding an answer that doesn't involve an automatically located field. |
|
Hi, as a rule of thumb, it is best not to have Splunk index new fields unless it is really necessary (it puts a higher burden on the indexer, among other costs). What you most likely need is a search-time field extraction, which you can configure as follows. Suppose your Oracle alert logs have the sourcetype "oracle_alert"; then, in local/props.conf:
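A minimal sketch of such a stanza, assuming a hypothetical source path layout of `<logdir>/<host>_<sid>.log` (for example `/var/log/oracle/dbhost01_ORCL.log`). The class name `source_fields` and the regex are illustrative assumptions and need adapting to the real filenames:

```ini
# local/props.conf
# Search-time extraction that runs against the "source" field, not _raw.
# Assumed (hypothetical) path layout: <logdir>/<host>_<sid>.log
[oracle_alert]
EXTRACT-source_fields = ^(?<logdir>.*)/(?<host_2>[^/_]+)_(?<sid>[^/.]+)\.log$ in source
```

The trailing `in source` is what points the extraction at the source path; without it, EXTRACT matches against the raw event text.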
That would instruct Splunk to extract three fields: logdir (anything before the last /), host_2 (renamed so that it does not override the original "host" field), and sid. You don't need to modify fields.conf for this. Another method would be to use transforms.conf as well. For further information on the alternative methods, you can write a comment here or refer to the props.conf documentation and search for the keyword "EXTRACT". If you want to test the regex before applying the configuration, you can use the rex command in the search bar to try the same pattern against the source field.
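As a rough sketch, assuming the same hypothetical `<logdir>/<host>_<sid>.log` filename layout (the regex is illustrative, not taken from the original post), such a test search might look like:

```
sourcetype=oracle_alert | rex field=source "^(?<logdir>.*)/(?<host_2>[^/_]+)_(?<sid>[^/.]+)\.log$"
```

`field=source` makes rex match against the source path instead of the default `_raw`.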
After running it, check that the three fields appear in the field picker on the left. Hope that helped a bit, Paolo |
|
You should be able to just define a transform in transforms.conf with SOURCE_KEY set to "source" and a REGEX that captures your field. Something like:
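A minimal sketch of such a transform, assuming a hypothetical filename layout of `<host>_<sid>.log` and a hypothetical field name `sid`; the regex is an illustration only:

```ini
# transforms.conf
# Search-time transform that reads the "source" field instead of _raw.
# The named capture group becomes the field name, so no FORMAT line is needed.
[a_transform]
SOURCE_KEY = source
REGEX = /[^/_]+_(?<sid>[^/.]+)\.log$
```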
In your props.conf, you reference "a_transform", such as:
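For instance, assuming the sourcetype is "oracle_alert" as in the answer above (the class name after `REPORT-` is an arbitrary label chosen for this sketch):

```ini
# props.conf
# REPORT- applies the named transforms.conf stanza at search time.
[oracle_alert]
REPORT-a_transform = a_transform
```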
You'll probably also have to define the field name in fields.conf, since the field value would not have been indexed; for example:
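A sketch of what that entry could look like, assuming the extracted field is named `sid` (a hypothetical name used for illustration):

```ini
# fields.conf
# Tell the search layer that values of "sid" do not appear in the indexed
# event text, so it should not try to look them up in the index.
[sid]
INDEXED = false
INDEXED_VALUE = false
```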
Note that if you search for this field alone, because it's marked as a non-indexed value, Splunk will perform a full table scan to find matches. To get around this performance issue, you could extract the field at index time, set up a lookup table that maps all sources to your fields, or set up a set of eventtypes.
(24 Aug '10, 17:03)
Stephen Sorkin ♦
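For the index-time alternative mentioned in the comment above, a rough sketch could look like the following. The stanza names, the field name `sid`, and the `<host>_<sid>.log` filename layout are all assumptions made for illustration. At index time the source is read through the `MetaData:Source` key, whose value carries a `source::` prefix, so the regex is left unanchored at the start:

```ini
# transforms.conf
[source_sid_indexed]
SOURCE_KEY = MetaData:Source
REGEX = /[^/_]+_([^/.]+)\.log$
FORMAT = sid::$1
WRITE_META = true

# props.conf
[oracle_alert]
TRANSFORMS-source_sid = source_sid_indexed

# fields.conf
[sid]
INDEXED = true
```

An index-time extraction like this only affects events indexed after the configuration is in place.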
|

Thanks, I know to avoid indexing fields; I just knew that 'host' was indexed, so I wasn't sure how fields from the filename were going to work out. I understand now, though. Thanks to both of you.