Splunk Search

Regex - match all occurrences of a character

kenchisho
Path Finder

Hi guys,

I have been playing around, trying to match multiple occurrences of a pattern and replace them using a regex in transforms.conf.

sample data
Kenan-xMuharemagic-x-xkenan@neseco.ba-x-x

I am trying to match the pattern "-x" and replace it with "_". It works perfectly for one match, but not for the remaining matches in the event. This is not a multiline event, and the characters I wish to replace may appear multiple times, even within a single word.

Desired Result
Kenan_ Muharemagic_ _ kenan@neseco.ba_ _ (excuse the " " at _)

I was able to do this earlier with the max_matches parameter at search time. This time I am trying to replace these characters before indexing, as you would when anonymizing credit card numbers and the like.

Is there a way to tell Splunk to replace all matches of a pattern in transforms.conf?

kenchisho
Path Finder

I've already tried that. I've been playing with this all day.

The case is that I am indexing a binary-encoded log file. Splunk indexes all ASCII characters without a problem, but there are a few non-ASCII characters that are indexed as "\x6\xD1", which would be "Đ".

I've tried modifying the CHARSET, but the only one that works is CP852, which is sadly not supported by Splunk.

As for sed, I have not been able to match the pattern with a sed regex within Splunk. However, when using standalone regex tools or the OS X command-line sed, I can match and replace the patterns without problems.

I have managed to work this out using transforms.conf with a regex and then applying that transform in props.conf multiple times (e.g. 10 times for a possible 10 repetitions of the same character in one event). This is a very ugly workaround, and I will try to find another way.

Example:

raw data: \x6\xD1\x6\xD1\x6\xD1\x6\xD1\x6\xD1

transforms.conf

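[bin2text]
# Literal backslashes in the event text must be escaped as \\ in the regex
REGEX = (?)(.*)\\x6\\xD1(.*)
FORMAT = $1Đ$2
# Rewrite the raw event itself rather than extracting a field
DEST_KEY = _raw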

props.conf
[sourcetype]
TRANSFORMS-test = bin2text, bin2text, bin2text, bin2text, bin2text

result data: ĐĐĐĐĐ

As I said, it is a very ugly solution, but it is the only one I got working. I'm open to suggestions if someone has an idea.

jonuwz
Influencer

See the Splunk documentation on anonymizing data.

Look at the SEDCMD section, and at the "global" /g flag.
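
For reference, here is a rough sketch of what a global SEDCMD replacement in props.conf could look like for the two samples in this thread. The stanza name your_sourcetype and the class names replace_dash_x and bin2text are placeholders, not names from the original posts. The second rule assumes the raw event really contains the literal text \x6\xD1 (backslashes included) and that props.conf is saved as UTF-8. The original poster reports that a sed expression did not match inside Splunk, so treat that rule as untested.

```
# props.conf (sketch; stanza and class names are hypothetical)
[your_sourcetype]

# Replace every "-x" in the raw event with "_" before indexing.
# The trailing g flag makes the substitution global within the event.
SEDCMD-replace_dash_x = s/-x/_/g

# Assumption: the raw event holds the literal characters \x6\xD1,
# so each backslash is escaped as \\ in the pattern.
SEDCMD-bin2text = s/\\x6\\xD1/Đ/g
```

With the g flag a single SEDCMD rule handles any number of repetitions in one event, so the transform would not need to be listed several times in props.conf.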
