Source or Sourcetype override

nags
Engager

I have a sourcetype-based definition in which I set INDEXED_EXTRACTION=JSON. There are 10 sources configured under this sourcetype. Let us say one of the 10 is not in JSON format. How can I keep using the same sourcetype without applying INDEXED_EXTRACTION=JSON to that one source? I thought of using a source::-based stanza in props with the other attributes and leaving out the INDEXED_EXTRACTION attribute. In that case, will the attribute still be picked up from the sourcetype declaration?


isoutamo
SplunkTrust

Hi

The sourcetype defines your log file schema, so it must be different for different log files or log event types. Here is an excellent description of it by Mark McCullough (Splunk Slack #bestpractices):

--8<--

I think I've finally figured out how to explain to "I know Splunk!" types what the sourcetype field means in a way that doesn't cause them to want to pick _json for everything that uses  JSON syntax:  "It's like a reference to a XSD file for XML.  It specifies what fields are required, what fields are permitted, and the overall structure of the event."

--8<--

There are also naming standards for KOs (knowledge objects) in Splunk, which help you manage them. In most cases I use the naming schema "owner:system/vendor:app:subsystem:log file:#". There is no need to use all of those parts, but a name usually has at least three of them plus a number as a suffix. When the format of the log changes later, I just increment the last digit by one.

In most cases, when you have a Splunk system that holds even a couple of different business or tech systems, you should use this kind of naming standard for all your KOs, such as apps, indexes, saved searches, alerts, etc. This will help you, and at the very least it helps your Splunk admins.
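To make the schema concrete, here is a sketch of what sourcetype stanzas might look like under it. The names (acme, billing, api) are made up for illustration and do not come from a real deployment; the trailing number is the format version described above.

```ini
# props.conf (sketch; hypothetical names following owner:system:app:#)
[acme:billing:api:1]
# Settings for the original log format go here.

[acme:billing:api:2]
# Same log after its format changed: a new sourcetype with the suffix incremented by one.
```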

r. Ismo


PickleRick
SplunkTrust

I'm not 100% sure what you want. As you can see in the docs (https://docs.splunk.com/Documentation/Splunk/Latest/Admin/Propsconf), you can define settings based on sourcetype, source or host so that some of the settings can be selectively applied to your sources.
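As a rough sketch of how that layering works: per the precedence rules in the props.conf docs, a [source::...] stanza overrides individual settings from the [<sourcetype>] stanza, and, as far as I understand it, any setting the source:: stanza does not mention is still taken from the sourcetype. So simply leaving INDEXED_EXTRACTIONS out of the source:: stanza would not switch it off. The sourcetype name, file path, and time zones below are hypothetical.

```ini
# props.conf (sketch; sourcetype name and file path are made up)
[my_sourcetype]
INDEXED_EXTRACTIONS = JSON
TZ = UTC

[source::/var/log/myapp/plain.log]
# Overrides TZ for this one source only.
TZ = Europe/Helsinki
# INDEXED_EXTRACTIONS is not set here, so the value from [my_sourcetype] would still apply.
```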

But the main question here is: why do you want INDEXED_EXTRACTIONS at all (there is an "S" at the end, it's important!)? INDEXED_EXTRACTIONS are sometimes unavoidable (if the log file has a variable order of columns and the field order is determined by the header row, it's the only reasonable way to ingest such a file), but often search-time parsing is enough, and with Splunk, search-time operations are generally the preferred method. So why not KV_MODE=json instead of INDEXED_EXTRACTIONS?
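For comparison, a minimal sketch of the search-time alternative. KV_MODE is a search-time setting, so this would go in props.conf on the search head; the sourcetype name is hypothetical.

```ini
# props.conf on the search head (sketch; sourcetype name is made up)
[my_sourcetype]
# Extract JSON fields at search time instead of at index time.
KV_MODE = json
```

If INDEXED_EXTRACTIONS = JSON stays enabled for the same data, turning this on as well would likely give you duplicate field values, so it is normally one or the other.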

 


gcusello
SplunkTrust

Hi @nags,

it isn't possible: a sourcetype defines the specification of a data source (INDEXED_EXTRACTIONS is one part of that specification), so you cannot use the same data definition for different data sources.

As a workaround, you could use similar sourcetype names (e.g. my_sourcetype and my_sourcetype_json), so that in searches you can use:

sourcetype=my_sourcetype*

to match both of them.
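A sketch of how that split could be wired up, assuming the files are read with monitor inputs. The paths and sourcetype names are hypothetical.

```ini
# inputs.conf (sketch; paths are made up)
[monitor:///var/log/myapp/events.json]
sourcetype = my_sourcetype_json

[monitor:///var/log/myapp/plain.log]
sourcetype = my_sourcetype

# props.conf
[my_sourcetype_json]
INDEXED_EXTRACTIONS = JSON

[my_sourcetype]
# No INDEXED_EXTRACTIONS here; other settings for the non-JSON source go in this stanza.
```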

Ciao.

Giuseppe
