Problem with using SOURCE_KEY

dmaislin_splunk
Splunk Employee

I have some XML data that I parse into many fields, one of which is "relativePath". Why can't I get a transform to extract a new field, "fileName", from that field using SOURCE_KEY? The same regex works fine with the rex command in the search bar:

  | rex field=relativePath "^.*[\\\/](?<fileName>.*)"

Sample Event:

<CheckEventRequest>
  <EventList count="1">
    <Event event="0x20000" path="\\cepapoc.emcsplunk.com\CHECK$\server2fs1\davidpoc2" flag="0x2" protocol="0" server="CEPAPOC" share="server2fs1" clientIP="10.0.0.2" serverIP="10.0.0.4" timeStamp="0x4EF4883C00014D1D" userSid="S-1-5-21-175151209-4036982877-1867759480-500" ownerSid="S-1-5-32-544" fileSize="0x0" newName="\\cepapoc.emcsplunk.com\CHECK$\server2fs1\SplunkEMC" desiredAccess="0x0" createDispo="0x0" ntStatus="0x0" relativePath="\\CEPAPOC\server2fs1\davidpoc2"/>
  </EventList>
</CheckEventRequest>

props.conf

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-xmlkv = xmlkv-alternative
REPORT-getFileName = getFileName

transforms.conf

[xmlkv-alternative]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2
MV_ADD = True

[getFileName]
SOURCE_KEY = relativePath
REGEX = ^.*[\\\/](.*)
FORMAT = fileName::"$1"

dmaislin_splunk
Splunk Employee

The answer is.....

The data was using autokv to extract all the delimited fields, not my xmlkv-alternative transform. SOURCE_KEY does not work well with fields produced by the default Splunk autokv. I replaced autokv with a kv-alternative transform and set KV_MODE = none.
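
A likely reason, sketched here under assumptions: the xmlkv-alternative regex only matches <tag>value</tag> element content, whereas this event stores its values as attributes of a self-closing <Event/> element, so relativePath would have come from automatic key/value extraction, which Splunk applies after REPORT transforms. The Python snippet below approximates this, with the re module standing in for Splunk's regex engine and a trimmed copy of the sample event. It is an illustration only, not Splunk's actual extraction code.

```python
import re

# Trimmed copy of the sample event (most attributes omitted for brevity).
EVENT = r'''<CheckEventRequest>
  <EventList count="1">
    <Event event="0x20000" share="server2fs1" relativePath="\\CEPAPOC\server2fs1\davidpoc2"/>
  </EventList>
</CheckEventRequest>'''

# REGEX from [xmlkv-alternative]: matches <tag ...>text</tag> pairs only.
XMLKV = re.compile(r'<([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>')

# REGEX from [kv-alternative]: matches name="value" attribute pairs.
KV = re.compile(r'(\w+)="([^"]+)"')

element_fields = XMLKV.findall(EVENT)        # element-content pairs found
attribute_fields = dict(KV.findall(EVENT))   # attribute pairs found

print(element_fields)
print(attribute_fields.get("relativePath"))
```

With this trimmed event, the element regex finds nothing to extract, while the attribute regex yields relativePath.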

props.conf

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-parsefields = kv-alternative,getFileName
TRANSFORMS-removehb = removehb
LOOKUP-event = eventlookup event OUTPUTNEW event_description
LOOKUP-dispo = dispolookup createDispo OUTPUTNEW createDispo_Description
KV_MODE = none

transforms.conf

[kv-alternative]
REGEX = (\w+)="([^"]+)"
FORMAT = $1::$2
MV_ADD = True

[getFileName]
SOURCE_KEY = relativePath
REGEX = (?<fileName>[^\\]+)$
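
As a rough illustration of what the two transforms above do together, the following Python sketch applies the kv-alternative regex to an attribute string and then runs the getFileName regex against the resulting relativePath value, mimicking SOURCE_KEY. This is not how Splunk executes transforms internally. The helper name extract_fields is hypothetical, and the named group is written (?P<fileName>...) because Python's re module requires that spelling where Splunk accepts (?<fileName>...).

```python
import re

# Attribute portion of the sample event, trimmed for brevity.
RAW = r'server="CEPAPOC" share="server2fs1" relativePath="\\CEPAPOC\server2fs1\davidpoc2"'

KV = re.compile(r'(\w+)="([^"]+)"')               # [kv-alternative] REGEX
FILE_NAME = re.compile(r'(?P<fileName>[^\\]+)$')  # [getFileName] REGEX, Python spelling


def extract_fields(raw):
    """Hypothetical helper: run kv-alternative, then getFileName on relativePath."""
    fields = dict(KV.findall(raw))
    source = fields.get("relativePath")           # plays the role of SOURCE_KEY
    if source is not None:
        match = FILE_NAME.search(source)
        if match:
            fields.update(match.groupdict())
    return fields


fields = extract_fields(RAW)
print(fields["fileName"])
```

If relativePath is missing from the first step, the second regex has nothing to read and no fileName is produced, which matches the symptom in the question.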

Ledion_Bitincka
Splunk Employee

This is most likely related to a bad regex. Assuming relativePath="\CEPAPOC\server2fs1\davidpoc2" and that you want to extract fileName=davidpoc2, the following should do the trick (note the updated regex in getFileName):

props.conf
[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-my_name = xmlkv-alternative, getFileName

transforms.conf
[xmlkv-alternative]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2
MV_ADD = True

[getFileName]
SOURCE_KEY = relativePath
# need to extract filenames from unix and windows paths, so use both forward/backward slashes
REGEX = (?<fileName>[^\\/]+)$
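
To make the comment about forward and backward slashes concrete, here is a small Python sketch comparing a backslash-only character class with the either-slash class used above. The Unix-style path is a made-up example, file_name is a hypothetical helper, and the named group uses Python's (?P<...>) spelling.

```python
import re

BACKSLASH_ONLY = re.compile(r'(?P<fileName>[^\\]+)$')   # excludes "\" only
EITHER_SLASH = re.compile(r'(?P<fileName>[^\\/]+)$')    # excludes "\" and "/"


def file_name(pattern, path):
    """Return the trailing path component matched by pattern, or None."""
    match = pattern.search(path)
    return match.group("fileName") if match else None


windows_path = r'\\CEPAPOC\server2fs1\davidpoc2'   # from the sample event
unix_path = '/mnt/server2fs1/davidpoc2'            # hypothetical path

print(file_name(EITHER_SLASH, windows_path))
print(file_name(EITHER_SLASH, unix_path))
print(file_name(BACKSLASH_ONLY, unix_path))
```

The either-slash class returns the last component for both path styles, while the backslash-only class returns the whole Unix-style path because it contains no backslash to stop at.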

_d_
Splunk Employee

Perhaps could be hitting the problem described here: http://blogs.splunk.com/2011/10/07/cannot-search-based-on-an-extracted-field/ 🙂


_d_
Splunk Employee

See if this works:

[cepa]
LINE_BREAKER = ([\r\n]+20\d{2}/[01]\d/[0123]\d\s[012]\d:[0-5]\d:[0-5]\d[\r\n]+)
SHOULD_LINEMERGE = FALSE
TIME_PREFIX = timeStamp="
MAX_TIMESTAMP_LOOKAHEAD = 18
DATETIME_CONFIG = /etc/apps/vnx/default/emc-epoch.xml
REPORT-my_name = xmlkv-alternative, getFileName

This particular REPORT sequence ensures that the [xmlkv-alternative] transform stanza gets applied first, then [getFileName].
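
The ordering point can be pictured with a toy model in Python. It assumes only what is stated above, namely that transforms in one REPORT list are applied left to right, and it uses a simple attribute regex as a stand-in for the first transform. All function names are hypothetical and none of this is Splunk code.

```python
import re

KV = re.compile(r'(\w+)="([^"]+)"')
FILE_NAME = re.compile(r'(?P<fileName>[^\\/]+)$')


def kv_transform(raw, fields):
    """Stand-in for the first transform: extract name="value" pairs from raw text."""
    return dict(KV.findall(raw))


def get_file_name(raw, fields):
    """Stand-in for [getFileName]: reads the relativePath field, not the raw text."""
    source = fields.get("relativePath")
    match = FILE_NAME.search(source) if source is not None else None
    return match.groupdict() if match else {}


def run_report(transforms, raw):
    """Apply transforms in list order, accumulating extracted fields."""
    fields = {}
    for transform in transforms:
        fields.update(transform(raw, fields))
    return fields


RAW = r'share="server2fs1" relativePath="\\CEPAPOC\server2fs1\davidpoc2"'
in_order = run_report([kv_transform, get_file_name], RAW)
reversed_order = run_report([get_file_name, kv_transform], RAW)
print("fileName" in in_order, "fileName" in reversed_order)
```

In this model fileName appears only when the transform that creates relativePath runs before the one that reads it.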

Hope this helps.
