Getting Data In

How to index email values without special characters?

andresito123
Communicator

Hello to the community!

I have an email field with values following this pattern: <example@example.com>

Is there any way to remove the special characters < and > and index the value as example@example.com?

Thanks!

0 Karma
1 Solution

Raghav2384
Motivator

Hello,

There are several ways to do it. Please check this accepted answer:

https://answers.splunk.com/answers/172300/how-to-extract-the-email-address-from-the-my-logs.html

Also, you could achieve something similar with SEDCMD. Please see the props.conf documentation:

http://docs.splunk.com/Documentation/Splunk/latest/admin/Propsconf

  • Syntax:
    • replace - s/regex/replacement/flags
      • regex is a perl regular expression (optionally containing capturing groups).
      • replacement is a string to replace the regex match. Use \n for back references, where "n" is a single digit.
      • flags can be either: g to replace all matches, or a number to replace a specified match.
    • substitute - y/string1/string2/
      • substitutes the string1[i] with string2[i]

Simple example: SEDCMD-hash = s/this/that/g

Make sure the expression doesn't also match other < and > characters in the same event.

Hope this helps!

Thanks,
Raghav
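
The caveat about other < and > characters can be illustrated with a small Python sketch. It assumes a made-up event line and a hypothetical helper, `unwrap_emails`, which removes the brackets only when they wrap something shaped like an email address. Python's `re.sub` stands in for Splunk's sed handling here. A sed-style expression along the lines of `s/<([^<>@\s]+@[^<>\s]+)>/\1/g` would be the rough equivalent; treat that pattern as an assumption to test against your own data, not a vetted email matcher.

```python
import re

# Brackets are dropped only when they enclose local@domain with no whitespace.
BRACKETED_EMAIL = re.compile(r"<([^<>@\s]+@[^<>\s]+)>")

def unwrap_emails(event: str) -> str:
    """Replace <user@host> with user@host; leave other <...> text alone."""
    return BRACKETED_EMAIL.sub(r"\1", event)

# Made-up sample event, not taken from a real log.
event = "from=<example@example.com> status=<deferred>"
print(unwrap_emails(event))  # from=example@example.com status=<deferred>
```

The simpler `s/[<>]//g` would also strip the brackets around `deferred` in this sample, which may or may not matter for your events.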

0 Karma

woodcock
Esteemed Legend

Like this:

... | rex field=MyEmailFieldName mode=sed "s/[<>]//g"
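
To see what that sed expression does to a value, here is a minimal Python sketch of the same substitution. `strip_angle_brackets` is a hypothetical helper name, and Python's `re.sub` is only a stand-in for Splunk's sed mode, so this illustrates the regex rather than how Splunk runs it.

```python
import re

def strip_angle_brackets(value: str) -> str:
    """Remove every '<' and '>' from value, mirroring s/[<>]//g."""
    return re.sub(r"[<>]", "", value)

print(strip_angle_brackets("<example@example.com>"))  # example@example.com
```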

andresito123
Communicator

This works, but I want it to happen at index time. I don't want the special characters to show up, and I need to map the field to CIM so that Enterprise Security will correlate this value as an email address without < and >.

0 Karma

woodcock
Esteemed Legend

Then use SEDCMD (with the same sed string, without the quotes):

http://docs.splunk.com/Documentation/Splunk/6.0.3/Data/Anonymizedatausingconfigurationfiles

0 Karma

andresito123
Communicator

I have put in my /opt/splunk/etc/system/local/props.conf the following:

[mysourcetype]
SEDCMD-stripEmail = "s/[<>]//g"

But it seems that the emails are still indexed as: <example@example.com>.

Any ideas?

0 Karma

Raghav2384
Motivator

Try without quotes, as @woodcock stated:

[mysourcetype]
SEDCMD-stripemail = s/[<>]//g

Hope this helps!

Thanks,
Raghav

0 Karma

ppablo
Retired

Hi @andresito123

I think the pattern you were trying to show didn't render properly. I would edit your question and re-paste your sample pattern, but be sure to use the text editing tools. Highlight your code, then click on the "Code Sample" button for it to display.

0 Karma

andresito123
Communicator

Fixed with an edit!

0 Karma