Getting Data In

Why are my files being re-indexed?

SK110176
Path Finder

I've noticed tons of duplicate events, and the following message in splunkd.log correlates with the time I started seeing the dupes. It also started after I upgraded from v4.0.9 to v4.1.4:

"File too small to check seekcrc, probably truncated. Will re-read entire file=....."

Does anyone know why this is occurring?

My settings in inputs.conf include:

crcSalt = <SOURCE>
followtail = 1

I've already checked for the following and none of these apply:

Causes of reindexing:

File contents (especially the first 256 bytes) are modified in-place. This shouldn't happen for log files (they're supposed to be a record).

The CHECK_METHOD for the files was set to entire_md5 or modtime. This forces the files to be reindexed.

Some sourcetypes like 'text_file' intentionally set the CHECK_METHOD because it is desired to index the complete file each time.
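As a rough illustration of the first cause above, here is a minimal Python sketch of a "CRC over the first 256 bytes" check. It assumes `zlib.crc32` as a stand-in checksum (Splunk's actual algorithm is not shown in this thread), and the `head_crc` helper and sample log lines are hypothetical.

```python
import zlib

HEAD_BYTES = 256  # size of the initial window mentioned above


def head_crc(data: bytes) -> int:
    """CRC over the first HEAD_BYTES bytes (zlib.crc32 is a stand-in, not Splunk's algorithm)."""
    return zlib.crc32(data[:HEAD_BYTES])


# Hypothetical log content, longer than 256 bytes.
original = b"2010-08-01 12:00:00 INFO service started\n" * 10

# Normal log growth: new lines are appended, the head is untouched.
appended = original + b"2010-08-01 12:05:00 INFO heartbeat\n"

# In-place modification: one byte inside the first 256 bytes changes.
edited = b"X" + original[1:]

print(head_crc(original) == head_crc(appended))  # head unchanged by appending
print(head_crc(original) == head_crc(edited))    # head changed by the in-place edit
```

Appending leaves the head checksum alone, so the file is still recognized. An edit inside the first 256 bytes changes the checksum, so the file looks new and would be read again from the start.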


Genti
Splunk Employee

crcSalt = <SOURCE>
followtail = 1

crcSalt = <string>
* Use this to force Splunk to consume files with matching CRCs.
* Set any string to add to the CRC.
* If set to "crcSalt = <SOURCE>", then the full source path is added to the CRC.

I'm assuming that after the upgrade Splunk is computing a different CRC for these files, and this is causing the double indexing.
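To make the effect of the salt concrete, here is a minimal Python sketch of what "adding the source path to the CRC" could look like. It is an assumption for illustration only: `zlib.crc32` stands in for the real checksum, the way the salt is combined with the data is a guess, and the `salted_crc` helper and file paths are hypothetical.

```python
import zlib


def salted_crc(head: bytes, salt: str = "") -> int:
    """Checksum of the file head with an optional salt mixed in.

    Assumption: the salt is simply prepended to the data; the real
    implementation may combine them differently.
    """
    return zlib.crc32(salt.encode("utf-8") + head)


# Identical leading bytes, as seen in two files (hypothetical content).
head = b"2010-08-01 12:00:00 INFO service started\n"

# Hypothetical source paths, e.g. a log file and its rotated copy.
path_a = "/var/log/app.log"
path_b = "/var/log/app.log.1"

# Without a salt, the same content always produces the same value.
print(salted_crc(head) == salted_crc(head))

# With the source path as the salt, the same content under a different
# path is expected to produce a different value.
print(salted_crc(head, path_a) == salted_crc(head, path_b))
```

Under this sketch, any change in how the salted value is derived, or in the path itself, would make already-indexed files look new, which fits the duplicate events described in the question.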
