Splunk Search

regex <title>TEXT</title>

tb582
Explorer

Hopefully a quick one, but I'm looking to search for and extract anything between these two tags: <title>TEXT</title>. Does anyone know how?

dwaddle
SplunkTrust

The extraction is simple:

| rex "<title>(?<title_text>.*)</title>"

The searching part I'll leave as an exercise...
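
For anyone who wants to try this without real data, here is a minimal sketch that builds a single fake event with makeresults. The sample event text is an assumption for illustration and is not taken from the original poster's data; owner_text is the second field that comes up later in the thread.

```spl
| makeresults
| eval _raw="<title>My title</title><copy>My copy</copy><owner>my owner</owner>"
| rex "<title>(?<title_text>.*)</title>"
| rex "<owner>(?<owner_text>.*)</owner>"
| table title_text owner_text
```

The trailing table is what limits the output to the extracted fields. Without it, the event list still shows the whole raw line, and title_text only appears as an additional field.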

Ayn
Legend

Well it's not going to do much good within a subsearch. You need to add it to your main search.

tb5821
Communicator

I took it out to see if it made a difference. It would look like this at the very end: ... | table task_id owner_text title_text

cphair
Builder

It may be the join that's messing you up. Try it without the subsearch:


index=myindex sourcetype=my-app host=04 OR host=050 Status_type="ERROR" NOT "The remote server returned an error: (401) Unauthorized." | rex "<title>(?<title_text>.*)</title>" | rex "<owner>(?<owner_text>.*)</owner>" | rename id AS "Asset" | table "Asset" task_id owner_text title_text

Since you're doing a left join anyway, you're keeping all the results from the original search, so you don't have to do another search over the same data cut.

Ayn
Legend

So where's the | table command that you reportedly were using?

tb5821
Communicator

Sorry, that pipe should have been included in the post.

Not sure if a where would be better; I'm new to Splunk 🙂

cphair
Builder

Not sure if the forum mangled your syntax, but you're missing a pipe character between OR "<owner>" and the first rex. Also, wouldn't a where command work better than a join on a subsearch? Something like "where NOT like(status_type, "(401) Unauthorized")".
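
A rough sketch of the where idea on two made-up events. The field name message and both message strings are hypothetical, chosen only for illustration. Note that like() uses SQL-style wildcards, so a substring match needs % on both sides of the pattern.

```spl
| makeresults count=2
| streamstats count AS n
| eval message=if(n=1, "The remote server returned an error: (401) Unauthorized.", "Some other error")
| where NOT like(message, "%(401) Unauthorized%")
| table n message
```

Only the second event should survive the where. This assumes the error text lives in an extracted field; if it only appears in the raw event, the NOT "..." term in the base search is the simpler filter.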

tb5821
Communicator
index=myindex sourcetype=my-app host=04* OR host=050 Status_type="ERROR" NOT "The remote server returned an error: (401) Unauthorized." | join type=left task_id [search iindex=myindex sourcetype=my-app host=04* OR host=050 "<title>" OR "<owner>" rex "<title>(?<title_text>.*)</title>" | rex "<owner>(?<owner_text>.*)</owner>" | rename id AS "Asset" | fields "Asset" task_id owner_text title_text]

Ayn
Legend

As both cphair and I have tried these suggestions ourselves with the expected results, I think it would be a good idea for you to paste a sample event. With the rex and table commands at the end, you really should be seeing only what's between the opening and closing title tags.

tb582
Explorer

Correct, I'm using rex. I'm not sure what you mean by field=, since it's just a string and not an actual extracted field.

cphair
Builder

I tried dwaddle's solution on my data and it worked fine. You are piping to rex, and not regex as you say in your title and first comment, correct? No newlines in your data? Does it make a difference if you specify field= to rex?
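
To illustrate the difference between the two commands, here is a small sketch on a made-up event (the event text is an assumption). regex only keeps or drops whole events depending on whether they match, while rex creates a new field from a named capture group. rex works on _raw by default, so spelling out field=_raw should not change the result.

```spl
| makeresults
| eval _raw="<title>My title</title><owner>my owner</owner>"
| regex _raw="<title>.*</title>"
| rex field=_raw "<title>(?<title_text>.*)</title>"
| table _raw title_text
```

With only the regex line, the event passes the filter but no title_text field exists, which would explain seeing nothing but the full line.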

tb5821
Communicator

Yes, and in the table I get the entire line...

MytitletestA doctor.My copyTV-MAmy owner

Ayn
Legend

And you applied the table command I wrote at the end of your search?

tb5821
Communicator

No, there's only one set of title tags, but it's all contained within one long line that has other tags.

Ayn
Legend

Do you have multiple <title> tags for some weird reason? In that case, you will want to make the regex match non-greedy:

| rex "<title>(?<title_text>.*?)</title>"
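
A quick way to see the effect is a fake event with two title tags (the event text and the field names greedy_text and lazy_text are made up for this sketch):

```spl
| makeresults
| eval _raw="<title>first</title><copy>x</copy><title>second</title>"
| rex "<title>(?<greedy_text>.*)</title>"
| rex "<title>(?<lazy_text>.*?)</title>"
| table greedy_text lazy_text
```

The greedy pattern should capture everything from "first" up to the last closing title tag, including the tags in between, while the non-greedy one should stop at the first closing tag and capture just "first".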

tb5821
Communicator

It's still giving me the entire line and not just what's between the tags.

Ayn
Legend

| table title_text

tb582
Explorer

Great, thanks, but I seem to be returning a huge line, as there's a bunch of tags such as TEXTTEXTTEXT. How do I limit it to only the text between the tags that I'm looking for? My search is "" | regex ...
