Hi Splunkers,
I have a question regarding the input extraction of XML fields (with inputs and transforms).
I have tried to follow the advice in this post:
https://answers.splunk.com/answers/683/xml-input-line-breaking-and-field-extraction-how.html
but have not been successful yet, since the XML structure of my data is somewhat different.
Here's the data:
<ClientStatistics refDate="2015-11-10T09:47:46.888+01:00"><RequestStatistics><Client created="2015-09-10T23:25:17.523+02:00" id="IDxxxx" lastPoll="2015-11-10T09:47:45.279+01:00" pollCount="3342838" pollThroughput="1563"/><Client created="2015-09-10T23:25:21.751+02:00" id="IDxxxx" lastPoll="2015-11-10T09:46:02.196+01:00" pollCount="45031" pollThroughput="116030"/><Client created="2015-09-10T23:25:30.007+02:00" id="IDxxxx" lastPoll="2015-11-10T09:47:46.850+01:00" pollCount="16640185" pollThroughput="314"/><Client created="2015-09-10T23:25:17.516+02:00" id="IDxxxx" lastPoll="2015-11-10T09:47:46.432+01:00" lastPush="2015-11-10T09:47:46.360+01:00" pollCount="40604184" pollThroughput="129" pushCount="11646891" pushThroughput="449"/><Client created="2015-09-17T11:13:03.268+02:00" id="IDxxxx" lastPoll="2015-09-17T11:29:03.415+02:00" pollCount="9" pollThroughput="120018"/><Client created="2015-09-17T11:16:03.552+02:00" id="IDxxxx" lastPoll="2015-11-09T08:02:02.497+01:00" pollCount="300" pollThroughput="15237597"/></RequestStatistics></ClientStatistics>
Yes, it's pretty unstructured, and it's not clean XML...
I have tried to put KV-MODE = xml in my inputs.conf, with no effect. Also, the other suggested settings, such as BREAK_ONLY_BEFORE or LINE_BREAKER, did not split my events.
I understand that it should be possible to extract the KV pairs inside the <Client> tags somehow, maybe with an additional transform. I figured it should be REGEX = (\w+)="([^"]+)" and FORMAT = $1::$2 inside transforms.conf, but I am missing the connection.
Can somebody please enlighten me?
At the risk of duplicating what you've already tried, try these props.conf settings.
SHOULD_LINEMERGE=false
LINE_BREAKER=(><)
TIME_PREFIX=Client created=
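To see what this LINE_BREAKER does to the sample data, here is a rough Python simulation. It assumes the documented Splunk behavior that events are broken where the regex matches and that the text matched by the first capture group is discarded. The helper `split_events` and the shortened `raw` string are made up for illustration and are not part of Splunk.

```python
import re

def split_events(raw, line_breaker):
    """Mimic LINE_BREAKER-style splitting: the text matched by the first
    capture group is discarded, everything else is kept."""
    events, pos = [], 0
    for m in re.finditer(line_breaker, raw):
        events.append(raw[pos:m.start(1)])  # text before the group ends this event
        pos = m.end(1)                      # the next event starts after the group
    events.append(raw[pos:])
    return [e for e in events if e]

# Shortened sample modeled on the data in the question (hypothetical subset).
raw = (
    '<ClientStatistics refDate="2015-11-10T09:47:46.888+01:00"><RequestStatistics>'
    '<Client created="2015-09-10T23:25:17.523+02:00" id="IDxxxx" pollCount="3342838"/>'
    '<Client created="2015-09-10T23:25:21.751+02:00" id="IDxxxx" pollCount="45031"/>'
    '</RequestStatistics></ClientStatistics>'
)

events = split_events(raw, r"(><)")
for event in events:
    print(event)
```

With this pattern the `><` between the wrapper tags also counts as a break, so `RequestStatistics` ends up as an event of its own, and each Client event keeps its trailing slash.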
Thanks a ton - this was a setting I actually hadn't tried yet 🙂
With one small modification (stripping the closing slash as well) it works perfectly!
SHOULD_LINEMERGE=false
LINE_BREAKER=(/><)
TIME_PREFIX=refDate=
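Once the events are split, the regex from the question can pull the key-value pairs out of a single Client event. In Splunk itself, as far as I know, a transforms.conf stanza holding that REGEX and FORMAT is normally referenced from props.conf through a REPORT-<class> setting. The Python below is only a sketch of what the regex matches: `extract_fields` and the sample `event` string are hypothetical names for illustration, and building a dict from the two capture groups stands in for FORMAT = $1::$2.

```python
import re

# Same pattern as in the question: group 1 is the field name, group 2 the value.
KV_REGEX = re.compile(r'(\w+)="([^"]+)"')

def extract_fields(event):
    """Return a dict of name -> value for every name="value" pair in the event."""
    return dict(KV_REGEX.findall(event))

# One event as it might look after line breaking (taken from the sample data).
event = (
    'Client created="2015-09-10T23:25:21.751+02:00" id="IDxxxx" '
    'lastPoll="2015-11-10T09:46:02.196+01:00" pollCount="45031" '
    'pollThroughput="116030"'
)

fields = extract_fields(event)
for name, value in fields.items():
    print(f"{name} = {value}")
```

Events that carry extra attributes, such as lastPush or pushCount, would simply yield more entries, since the regex does not depend on a fixed attribute list.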
What values of BREAK_ONLY_BEFORE and LINE_BREAKER have you tried?
I have tried numerous versions of RegExes, started with a simple '<', '