When the filesystem that Splunk uses to store its indexes becomes unavailable or goes into read-only mode, or when Splunk crashes, inconsistencies are sometimes introduced in the metadata files of some indexes and buckets. These files are typically Sources.data, Hosts.data and SourceTypes.data. There is a set of these in the index's hot/warm directory and in each bucket.
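To get an overview of where these files live for a given index, something like the following could be used. This is only a sketch: `list_metadata_files` is a made-up helper name, not a Splunk command, and it simply lists every file with one of the three names mentioned above, along with its size and modification time.

```shell
#!/bin/sh
# Hypothetical helper (not part of Splunk): list the metadata files
# found under an index directory, with size and timestamp.
list_metadata_files() {
    index_path=$1
    find "$index_path" -type f \
        \( -name 'Sources.data' -o -name 'Hosts.data' -o -name 'SourceTypes.data' \) \
        -exec ls -l {} +
}

# Example (path is the default index location mentioned in this thread):
#   list_metadata_files "$SPLUNK_HOME/var/lib/splunk/defaultdb"
```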

The presence of a corrupt metadata file in a bucket of an index that is currently in use will keep Splunk from restarting. Typically, errors like the one below show up in $SPLUNK_HOME/var/log/splunk/splunkd.log and Splunk crashes when attempting to start:

ERROR WordPositionData - couldn't parse hash code

Unfortunately, although splunkd.log reports which index contains a corrupt metadata file as Splunk starts, it does not indicate which bucket the file is in or which file it is.
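To confirm that this is the error being hit, the log can be searched for the component name from the message above. A minimal sketch, where `find_metadata_errors` is a hypothetical helper name and the default log path is the one given in this question:

```shell
#!/bin/sh
# Hypothetical helper: print (with line numbers) every splunkd.log line
# that mentions the WordPositionData component seen in the error above.
find_metadata_errors() {
    log_file=${1:-"$SPLUNK_HOME/var/log/splunk/splunkd.log"}
    grep -n 'WordPositionData' "$log_file"
}

# Example:
#   find_metadata_errors                      # uses the default log path
#   find_metadata_errors /path/to/splunkd.log
```

The function returns a nonzero status when nothing matches, which is plain `grep` behavior.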

Is there a way to quickly scan an index and all of its buckets to detect which metadata files are corrupted and need to be moved out of the way?

asked 09 Aug '10, 04:23

hexx ♦

edited 16 Sep '10, 18:00

piebob ♦♦


2 Answers:

There is a command that ships with Splunk that can check the consistency of the metadata files of any given index or bucket:

$SPLUNK_HOME/bin/splunk cmd recover-metadata {path_to_index|path_to_bucket} --validate

Note that the "--validate" option essentially acts like "fsck -n": it reports errors but does not make any changes. For a given index, I like to run the script below to check the metadata files in each bucket and then those at the root of the hot/warm db:

for i in $(find "$PATH_TO_INDEX" -type d \( -name 'db_*_*_*' -o -name 'hot_v*_*' \)); do echo "Checking metadata in bucket $i ..."; "$SPLUNK_HOME/bin/splunk" cmd recover-metadata "$i" --validate; done; "$SPLUNK_HOME/bin/splunk" cmd recover-metadata "$PATH_TO_INDEX/db" --validate

Or, expanded for readability (at least by shell-script standards):

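# Validate the metadata files in every warm/cold (db_*) and hot (hot_v*) bucket.
for i in $(find "$PATH_TO_INDEX" -type d \( -name 'db_*_*_*' -o -name 'hot_v*_*' \)); do
    echo "Checking metadata in bucket $i ..."
    "$SPLUNK_HOME/bin/splunk" cmd recover-metadata "$i" --validate
done
# Then validate the index-level metadata at the root of the hot/warm db.
"$SPLUNK_HOME/bin/splunk" cmd recover-metadata "$PATH_TO_INDEX/db" --validate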

"PATH_TO_INDEX" should be the path to the directory of the affected index containing the "db" and "colddb" directories. For the default index ("main"), it is "$SPLUNK_HOME/var/lib/splunk/defaultdb".

Each time an error is reported, the corresponding .data file should be moved out of the way or deleted, as Splunk will rebuild it on the next startup.
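If you prefer moving over deleting, a small helper along these lines keeps the suspect files around for later inspection. This is a sketch only: `quarantine_metadata_file` and the quarantine directory are made-up names, and the original path is flattened into the file name so that files from different buckets do not overwrite each other.

```shell
#!/bin/sh
# Hypothetical helper: move one corrupt .data file into a quarantine
# directory instead of deleting it.
#   $1 = path to the .data file reported as corrupt
#   $2 = directory to move it into (created if missing)
quarantine_metadata_file() {
    data_file=$1
    quarantine_dir=$2
    [ -f "$data_file" ] || { echo "not a file: $data_file" >&2; return 1; }
    mkdir -p "$quarantine_dir" || return 1
    # Turn slashes into underscores so the bucket path survives in the name.
    safe_name=$(printf '%s' "$data_file" | tr '/' '_')
    mv "$data_file" "$quarantine_dir/$safe_name.$(date +%Y%m%d%H%M%S)"
}

# Example (bucket name is illustrative):
#   quarantine_metadata_file "$PATH_TO_INDEX/db/db_1_2_3/Hosts.data" /tmp/splunk_quarantine
```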

Another solution is to create a "meta.dirty" file at the root of the affected index db ($SPLUNK_HOME/var/lib/splunk/defaultdb/db/ for example), which will also dynamically prompt Splunk to rebuild the metadata files for that index.
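As a sketch of that second approach, assuming an empty file is enough to act as the marker (the text above only says the file must exist), with `mark_index_dirty` being a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: create the meta.dirty marker at the root of an
# index db directory. Assumes an empty marker file is sufficient.
#   $1 = index db directory, e.g. $SPLUNK_HOME/var/lib/splunk/defaultdb/db
mark_index_dirty() {
    index_db=$1
    [ -d "$index_db" ] || { echo "not a directory: $index_db" >&2; return 1; }
    touch "$index_db/meta.dirty"
}

# Example:
#   mark_index_dirty "$SPLUNK_HOME/var/lib/splunk/defaultdb/db"
```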

Once all corrupted metadata files have been removed, the check should be run again. It will report errors for those files because they can no longer be found, but Splunk should now be ready to start.

Repeat the operation for each index for which splunkd.log reports this type of error.
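When several indexes are affected, the same check can be wrapped in a function that takes any number of index paths. This is a sketch under a few assumptions: `validate_indexes` and the `SPLUNK_BIN` override are made-up names, and each path is expected to be an index directory containing "db" and "colddb" as described above. It does not rely on the exit code of recover-metadata; it only prints what the command reports.

```shell
#!/bin/sh
# Hypothetical wrapper: run "recover-metadata --validate" on the hot/warm
# db root and on every bucket of each index path given as an argument.
# SPLUNK_BIN is an assumed override; it defaults to $SPLUNK_HOME/bin/splunk.
validate_indexes() {
    splunk_bin=${SPLUNK_BIN:-"$SPLUNK_HOME/bin/splunk"}
    for index_path in "$@"; do
        echo "=== Index: $index_path ==="
        # Index-level metadata at the root of the hot/warm db.
        "$splunk_bin" cmd recover-metadata "$index_path/db" --validate
        # Bucket-level metadata; read line by line so odd paths survive.
        find "$index_path" -type d \( -name 'db_*_*_*' -o -name 'hot_v*_*' \) |
        while IFS= read -r bucket; do
            echo "Checking metadata in bucket $bucket ..."
            "$splunk_bin" cmd recover-metadata "$bucket" --validate
        done
    done
}

# Example (second path is illustrative):
#   validate_indexes "$SPLUNK_HOME/var/lib/splunk/defaultdb" /path/to/another_index
```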

answered 09 Aug '10, 04:28

hexx ♦

edited 13 Dec '10, 21:27

jrodman ♦

Do note that in most cases, it's the metadata files in the index root directory and/or in its hot buckets that are responsible for this situation.

(17 Sep '10, 02:57) hexx ♦

As a corollary to the metadata checker above, the following can be used to check the health of your tsidx (time-series index) files:

for tsidx_file in $(find "$PATH_TO_INDEX" -type f -name '*.tsidx'); do
   # Capture the output so it is only shown when the probe fails.
   output="$("$SPLUNK_HOME/bin/splunk" cmd tsidxprobe "$tsidx_file" 2>&1)"
   tsidxprobe_exit_code=$?
   if [ "$tsidxprobe_exit_code" -ne 0 ]; then
      echo "tsidxprobe error: $tsidx_file gave an error; return code: $tsidxprobe_exit_code"
      echo "$output"
   fi
done

The key idea here is that tsidxprobe returns a nonzero exit code on failure and its output is hard to predict, so the script stores the output and prints it only when the probe fails.

answered 13 Dec '10, 20:24

jrodman ♦

edited 13 Dec '10, 21:16

Last updated: 13 Dec '10, 21:27

Copyright © 2005-2012 Splunk, Inc. All rights reserved.