Splunk Search

Microsoft O365 add-on: How to extract data from emails?

sanju2408de
Explorer

I am facing challenges while extracting the data from emails, using the Microsoft O365 email add on.

I want to extract the "Requested for" and "Finished" for which respective values are "ABC.ITGLOBAL@XYZ.com" and "Fri, Mar 11 2022 15:09:29 GMT+00:00".

I have tried Regex101 site and could successfully test a Regex pattern as below for matching the value for "Requested for" but the same pattern doesn't work in Splunk.

(?i) for\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\S+\s+\w+\-\w+:\w+\-\w+\"\>(?P<Requested_For>\S+)(?=\<\/td)

I need help here to sort this out, please if anyone can share their thoughts here.

Finished</td><td class="" style="vertical-align:top; padding:10px 4px; border-bottom:solid #eaeaea 1px; text-align:left; white-space:normal; width:99%; word-break:break-word">Fri, Mar 11 2022 15:09:29 GMT+00:00</td></tr><tr><td class="" style="vertical-align:top; padding:10px 4px; border-bottom:solid #eaeaea 1px; text-align:left; white-space:nowrap; font-weight:600; min-width:130px">Requested for</td><td class="" style="vertical-align:top; padding:10px 4px; border-bottom:solid #eaeaea 1px; text-align:left; white-space:normal; width:99%; word-break:break-word">ABC.ITGLOBAL@XYZ.com</td></tr><tr><td class=""

 

Labels (1)
0 Karma
1 Solution

richgalloway
SplunkTrust
SplunkTrust

Try this regex.  It tries to avoid depending on the number of words between groups.

Finished\<\/td>\<[^>]+>(?<Finished>[^\<]+).*?Requested for\<\/td>\<[^\<]+>(?<Requested_For>[^\<]+)

Also, the ?i flag most likely is not needed since the keywords in the data ("Finished" and "Requested for") probably will always be the same. 

---
If this reply helps you, Karma would be appreciated.

View solution in original post

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Try this regex.  It tries to avoid depending on the number of words between groups.

Finished\<\/td>\<[^>]+>(?<Finished>[^\<]+).*?Requested for\<\/td>\<[^\<]+>(?<Requested_For>[^\<]+)

Also, the ?i flag most likely is not needed since the keywords in the data ("Finished" and "Requested for") probably will always be the same. 

---
If this reply helps you, Karma would be appreciated.
0 Karma

sanju2408de
Explorer

@richgalloway Thanks so much for your help, this actually worked.

We had few more fields to extract from the same email and i used the same regex patterns as you have provided. It perfectly worked.

 

Many Thanks again for your help.

0 Karma
Get Updates on the Splunk Community!

Enter the Splunk Community Dashboard Challenge for Your Chance to Win!

The Splunk Community Dashboard Challenge is underway! This is your chance to showcase your skills in creating ...

.conf24 | Session Scheduler is Live!!

.conf24 is happening June 11 - 14 in Las Vegas, and we are thrilled to announce that the conference catalog ...

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...