How do I create indexed fields in a summary index?

Lowell · 01-25-2017

I'm populating a summary index with data that I would like to be able to search very quickly using tstats. I've got this mostly working but can't quite seem to figure out if I'm doing something wrong or why it isn't working as expected.

Summary index generating search: search_foo
Fields to index: a, b, c, d
I want to be able to write a search like this: | tstats sum(a), sum(b), values(c) WHERE index=summary source=search_foo by d

Here are the settings I'm trying to make work:

props.conf:

[source::search_foo]
TRANSFORMS-index-fields = search_foo_indexfields

transforms.conf:

[search_foo_indexfields]
REGEX = \b(a|b|c|d)=("?)([^"]*?)\2(?:,|$)
FORMAT = $1::$3
WRITE_META = true
REPEAT_MATCH = true

I know that the names and meta settings are correct because the first field does get added as an indexed field. (I confirmed this by running exporttool -csv on one of the buckets and checking that the field showed up in the _meta field.) However, Splunk seems to be ignoring the REPEAT_MATCH setting, since only the first match is indexed.

So as a workaround, I've made REGEX match all 4 fields directly and index them all at once (e.g., FORMAT = a::$1 b::$2 c::$3 d::$4). This works, but I really don't like the approach because it assumes a hard-coded order of the fields, which seems unnecessarily fragile. In my actual use case, sometimes "a" or "b" is missing from the data. I've been able to make the regex cope with that, but it still results in an empty indexed field. (In other words, if "b" is missing from the data, I still see b:: in _meta when I run exporttool.) I also considered making 4 transforms entries, one for each field, but that seems silly as well.

Bonus question: Here's a somewhat related question: how do I avoid double-escaping backslashes in my solution? One of my actual fields is "source", so Windows paths show up in the raw data with escaped backslashes ( \\ ), which get translated to double-escaped backslashes ( \\\\ ) in the _meta field. That means that at search time, the indexed fields look like "C:\\Windows\\.." instead of "C:\Windows\..".

woodcock · 03-05-2017

Double-check that the source value for the data in your Summary Index matches your stanza header specification.

woodcock · 01-25-2017

Many people do not know about _KEY_1 and _VAL_1 (you can search on it). Try this:

[search_foo_indexfields]
REGEX = \b(?<_KEY_1>a|b|c|d)=("?)(?<_VAL_1>[^"]*?)\2(?:,|$)
WRITE_META = true
MV_ADD = true

Lowell · 01-25-2017

Okay, so this adds a new field with the name of the transforms stanza ("search_foo_indexfields") with the value of either "a" or "b".

Just confirmed it in the _meta field dumped out with exporttool. "... date_mday::25 date_zone::0 search_foo_indexfields::a"

From the docs, it's not 100% clear whether _KEY_x and _VAL_x are supported at index time, but they don't seem to be working.

woodcock · 03-05-2017

You have to deploy these configurations to the INDEXING SERVER. In most cases this means your indexers. HOWEVER, in the case of Summary Indices, by default (unless you went out of your way to change it), the data is stored on the SEARCH HEAD, so you will have to EITHER deploy the configurations to the Search Head OR make sure that Summary Indexing happens on the Indexers.

Lowell · 01-25-2017

I assume _KEY_! is a typo for _KEY_1? I was aware of that syntax, but didn't think it held any advantages here. (But I'll give it a try.) I haven't tried MV_ADD as the docs say, "This attribute is only valid for search-time field extractions."

woodcock · 01-25-2017

Yes, fixed.

somesoni2 · 01-25-2017

How about creating a separate TRANSFORMS stanza for each field, so that even if one field is missing, the others show up independently?
For the double escaping, maybe try applying some command in the summary index search to remove the escaped backslashes.
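
A minimal sketch of the one-stanza-per-field idea, adapted from the regex in the question. The stanza names (search_foo_a, search_foo_b) are hypothetical, and only two of the four fields are shown; c and d would follow the same pattern. Because the field-name capture group is dropped, the quote group becomes \1 and the value becomes $2.

```
# props.conf
[source::search_foo]
TRANSFORMS-index-fields = search_foo_a, search_foo_b

# transforms.conf
[search_foo_a]
REGEX = \ba=("?)([^"]*?)\1(?:,|$)
FORMAT = a::$2
WRITE_META = true

[search_foo_b]
REGEX = \bb=("?)([^"]*?)\1(?:,|$)
FORMAT = b::$2
WRITE_META = true
```

A stanza whose REGEX does not match an event should write nothing for that event, so a missing "b" would not leave an empty b:: entry in _meta.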

Lowell · 01-25-2017

I'd like to avoid one transforms stanza per field if possible. My real use case has more than just 4 fields. (Not an unmanageable number; it just seems like there has to be a better solution.)

I'm pretty sure the backslash escaping is happening automatically in the summary indexing plumbing commands. (I'm just using the default built-in alert actions for summary indexing.) In fact, I'm already dealing with escaped backslashes in part of my search, so I know they've been taken care of in my base search.

And yes I could remove them at search time, but since I'm in control of the data generation, it seems silly to deal with something in every search I write, if I could fix the issue once when the data is written.

somesoni2 · 01-25-2017

Give this a try?

[search_foo_indexfields]
REGEX = \b(?<_KEY_1>(?:a|b|c|d))=("?)(?<_VAL_1>[^"]*?)\2(?:,|$)
WRITE_META = true
REPEAT_MATCH = true
