Splunk Search

How to write the regex to extract a field from XML data if the field is not completely XML?

jameskerivan
Explorer

Hi

I have a field containing XML, and I would like to extract a value from it. The only problem is that the field is not completely XML. I am not allowed to post a real example, but basically I want to extract from something that looks like:

Event xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"><ns2:behaviorVersion>0</ns2:behaviorVersion><triggers><channelId>0055</channelId><clientVersion>3</clientVersion></triggers><eventInfo><bos:instanceId>000121481</bos:instanceId><bos:serverName>1</bos:serverName><bos:implementationName>TransferStarted</bos:implementationName>

And I would like to grab TransferStarted in between the two tags <bos:implementationName> and </bos:implementationName>.

I have worked with regex in the past, but am still not confident. Any help would be much appreciated and Happy New Year!

1 Solution

sundareshr
Legend

Have you tried this?

implementationName\>(\w+)\<
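As a quick check outside Splunk, the pattern can be tried against the sample event with Python's `re` module. This is only a sketch: the `event` string is the tail of the sample from the question, and Python's engine is not the PCRE engine that `rex` uses, although a pattern this simple should behave the same in both. The backslashes before `>` and `<` are harmless but not required.

```python
import re

# Tail of the sample event from the question. It is not well-formed XML,
# which does not matter to a regex.
event = (
    "<eventInfo><bos:instanceId>000121481</bos:instanceId>"
    "<bos:serverName>1</bos:serverName>"
    "<bos:implementationName>TransferStarted</bos:implementationName>"
)

# Same pattern as in the answer; \> and \< simply match literal > and <.
pattern = re.compile(r"implementationName\>(\w+)\<")

match = pattern.search(event)
value = match.group(1) if match else None
print(value)
```

Note that `\w+` only covers letters, digits and underscores, so a value containing spaces, dots or hyphens would be cut short or missed.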

jameskerivan
Explorer

Yes this is what I want. Right now I am doing

base query | rex field=F "(?<preName>.*)implementationName\>(\w+)\<" | stats count by preName | sort count desc

But this is providing me with everything before implementationName as I specified. How would I extract that field? The way I see the regex working is it matches implementationName and looks for the characters > < for opening and closing of the value I want. Do I need to specify a variable for that value?


sundareshr
Legend

Try this, assuming preName is the name you want for that field.

"implementationName>(?<preName>w+)<"

sundareshr
Legend

Correction: there should be a backslash before "w+", so the group reads (?<preName>\w+).
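The effect of where the group name sits, and of the missing backslash, can be sketched in Python. Two assumptions to keep in mind: Python spells named groups `(?P<name>...)` where `rex` uses `(?<name>...)`, and the `event` string is a shortened, made-up variant of the sample in the question.

```python
import re

event = (
    "<bos:serverName>1</bos:serverName>"
    "<bos:implementationName>TransferStarted</bos:implementationName>"
)

# Name on the greedy prefix: preName holds everything before the tag name.
prefix_named = re.search(r"(?P<preName>.*)implementationName>(\w+)<", event)

# Name on the value group: preName holds the text between the tags.
value_named = re.search(r"implementationName>(?P<preName>\w+)<", event)

# Without the backslash, w+ matches only literal 'w' characters,
# so nothing is found in this event.
no_backslash = re.search(r"implementationName>(?P<preName>w+)<", event)

print(prefix_named.group("preName"))
print(value_named.group("preName"))
print(no_backslash)
```

In short, the name belongs on the group that wraps the value, not on a prefix in front of the tag.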


jameskerivan
Explorer

The stats this gives me are very confusing. Here is my query:

base query | rex field=F "implementationName>(?<preName>\w+)<" | stats count by preName | sort count desc

This is giving me only a small number of the implementationNames, not all of them. For example, TransferStarted did not get counted in my stats, but if I look in the events I can see it. Am I missing something?


sundareshr
Legend

If there is more than one occurrence of preName in a single event, you should add max_match=0 to the rex command and use multivalue functions to get the right result.
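To illustrate why a value can be visible in the events yet missing from the counts, here is a small Python sketch. The two events are hypothetical. `re.search` stands in for `rex` with its default of one match per event, and `re.findall` stands in for `max_match=0`.

```python
import re
from collections import Counter

# Hypothetical events: the second one contains two implementationName elements.
events = [
    "<bos:implementationName>TransferQueued</bos:implementationName>",
    "<bos:implementationName>TransferQueued</bos:implementationName>"
    "<bos:implementationName>TransferStarted</bos:implementationName>",
]

pattern = re.compile(r"implementationName>(\w+)<")

# First match per event only, comparable to rex with max_match=1.
first_only = Counter(m.group(1) for m in map(pattern.search, events) if m)

# Every match per event, comparable to rex with max_match=0.
all_matches = Counter(name for e in events for name in pattern.findall(e))

print(first_only)
print(all_matches)
```

With only the first match kept, TransferStarted never reaches the count even though it is present in the second event. In SPL, one option after `rex max_match=0` is to run `mvexpand preName` before `stats`, so that each extracted value gets its own row.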


jameskerivan
Explorer

Thank you very much. You have been so helpful. The problem I am coming across is with the way we are logging. Your query is correct!
