I am working with the following input and wanted some advice on how/where to specify the field extractions:
I have documentation from the vendor specifying value lengths and definitions, and we can perform most field extractions via individual regex field extractions, but we wanted to know if there is a better or more efficient method recommended. For reference, the field mapping table is listed below, and I have included samples for a couple of the current field extractions.
For example, to extract the duration hours, minutes, tenths of minutes we use the following regex:
|
|
A single regular expression is IMO the most efficient way to extract the fields here. To get rid of the \x00 values in your events, you could adjust the LINE_BREAKER setting for your sourcetype in props.conf:
The code:
Does not appear to be removing the "x00x00x00" from the
(07 Dec '10, 20:49)
Toups
|
|
Most efficient would probably be a single search time REGEX extraction:
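To illustrate the single-pass idea outside of Splunk, here is a minimal Python sketch. The field names and widths (`duration_hours`, `duration_minutes`, `duration_tenths`, `dialed_number`) are hypothetical and are not taken from the vendor's field mapping table; the point is only that one anchored pattern with named groups pulls every field out of a fixed-width record at once, instead of running one regex per field.

```python
import re

# Hypothetical fixed-width layout: these names and widths are illustrative
# only and are NOT taken from the vendor's field mapping table.
CDR_PATTERN = re.compile(
    r"^(?P<duration_hours>\d)"      # 1 digit: hours
    r"(?P<duration_minutes>\d{2})"  # 2 digits: minutes
    r"(?P<duration_tenths>\d)"      # 1 digit: tenths of a minute
    r"(?P<dialed_number>\d{10})"    # 10 digits: dialed number
)


def extract_fields(record: str) -> dict:
    """Return all named fields from one record in a single regex pass."""
    match = CDR_PATTERN.match(record)
    return match.groupdict() if match else {}


# Made-up record: 0 hours, 12 minutes, 5 tenths, then a 10-digit number.
sample = "01255551234567"
fields = extract_fields(sample)
print(fields)
```

Because the fields are adjacent with no delimiters, the pattern relies entirely on position and width, so it has to be anchored at the start of the record and list the fields in order.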
And so on. That way, all fields are extracted in a single pass over the data. Note that with this particular data, you may run into some problems searching for particular fields by a specific value (if the value is pressed right up against adjacent fields with no white space). You can deal with those for selected fields, if you're commonly searching on them, by using index-time extractions, but again, do so selectively and only if you determine it's really necessary for that field (e.g., don't do it with the time fields, and probably not with the dialed number).
It sounds like index-time extraction is best, as many of the fields are adjacent. Why do you recommend against including items such as time or dialed number in the index-time extraction? The target application will be a Call Detail Record index and a sub-component of an event correlation system.
(06 Dec '10, 18:21)
Toups
Because if you're not searching for the specific values, indexing more fields will increase the size of the index, which can decrease performance for all searches. If you are searching rarely for specific values of
(06 Dec '10, 18:44)
gkanapathy ♦
Thank you, I think this is the information we were looking for. Your time and attention is greatly appreciated!
(06 Dec '10, 20:00)
Toups
