Refine your search:

The logs I'm trying to index are in a log4j style, and entries such as

2010-06-15 09:04:08,204 [[ACTIVE] ExecuteThread: '9' for queue: 'weblogic.kernel
.Default (self-tuning)'][intOrdId=17746,intVrsn=18846] DEBUG com.att.canopi.idis
.ordermanagement.sms.cramer.adapter.util.ReflectionsTools.ipagi1a1.snt.bst.bls.c
om - Set value1:Customer100 field name1:customerName

are properly split into unique entries, however some entries are multi-line, and have embedded XML, and these are split wherever a date (with a different format than the date at the start of an entry) is found, so the log entry

2010-06-15 09:04:07,686 [[ACTIVE] ExecuteThread: '9' for queue: 'weblogic.kernel
.Default (self-tuning)'][intOrdId=17746,intVrsn=18846] INFO  fooHandler - <env:Envelope xmlns:env="
http://schemas.xmlsoap.org/soap/envelope/">
  <env:Header/>
  <env:Body>
    <v1:getFooDetailRequest xmlns:v1="ns1" correlationId="coid" systemId="mybox" clientId="cid" mock="false" requestTime="2010-06-15T09:04:07.643-04:00">
       <v1:fooId>foo<v1:fooId>
    </v1:getFooDetailRequest>
  </env:Body>
</env:Envelope>

gets split on the "getFooDetailRequest" line into two entries.

I've tried writing my own sourcetype in a props.conf with

MORE_THAN_100 = ^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \[.*?

and setting the files to that sourcetype manually, but I get the same result.

Does anyone know how to modify the built-in log4j sourcetype (since it is so close to being perfect), or have any other suggestions?

asked 15 Jun '10, 19:05

Adam's gravatar image

Adam
584
accept rate: 100%


One Answer:

You can override any Splunk default configurations by setting the corresponding setting name (under the same stanza header, in this case [log4j]), in the "local" directory. Basically, a setting in a "local" dir version of a file overrides the corresponding setting in the corresponding stanza in the "default" dir. Although the fact is, it's probably better to just define it over:

[log4j]
BREAK_ONLY_BEFORE =
BREAK_ONLY_BEFORE_DATE = true
SHOULD_LINEMERGE = true
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 25

This is probably the easiest to understand, though for high-volume systems (> 100 GB/day), use:

[log4j]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2},\d{3})
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 25
link

answered 15 Jun '10, 19:26

gkanapathy's gravatar image

gkanapathy ♦
26.4k1622
accept rate: 42%

That worked! I was a little concerned because I saw a few items that still weren't split properly, but after a couple of minutes all the kinks worked out and now even XML with 10 timestamps isn't split. Thanks!

(15 Jun '10, 20:05) Adam
Post your answer
toggle preview

Follow this question

Log In to enable email subscriptions

RSS:

Answers

Answers + Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "Title")
  • image?![alt text](/path/img.jpg "Title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Tags:

×154

Asked: 15 Jun '10, 19:05

Seen: 673 times

Last updated: 15 Jun '10, 19:26

Copyright © 2005-2012 Splunk, Inc. All rights reserved.