Proactively monitor for bucket corruption

jamesoconnell
Path Finder

I just repaired corrupt buckets for a partner index on one of our enterprise indexers.
The issue only became apparent after the customer saw the warnings on their reports.

My question is: are there easy proactive warnings that administrators can receive to highlight index bucket corruption, rather than leaving it up to our customers to find the problems?

bheemireddi
Communicator

If you are using the Monitoring Console, that would be a good starting point. It gives visibility into indexer clustering activity. The link below might get you started; it covers the available dashboards and searches, so maybe you can set up alerts on them. Also, Settings > Indexer clustering on the cluster master might give you some insights.
https://docs.splunk.com/Documentation/Splunk/6.6.2/Indexer/Viewindexerclusteringstatus

isoutamo
SplunkTrust

Hi

We can find corrupted buckets in a multisite cluster with the following search / alert:

index=_internal component=CMMaster state=Discard incoming_bucket_size=* earliest=-30d@d 
| dedup bid 
| table _time,bid,peer_name,existing_bucket_size,incoming_bucket_size
| sort bid,_time

This shows the bucket ID and the source peer.

r. Ismo

isoutamo
SplunkTrust

Even though this is an old thread, I would like to add what one can do with current versions.

Just run this:

| dbinspect index=* OR index=_* corruptonly=true 
| search state!=hot

Select a long enough time range to find all corrupted buckets.
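
To get a proactive warning out of this, the search could be scheduled as an alert in savedsearches.conf. The stanza below is only a minimal sketch, not a tested configuration: the stanza name, the daily 06:00 schedule, the 30-day window, and the e-mail address are all placeholder assumptions to adjust for your environment.

```ini
# savedsearches.conf (sketch; stanza name and all values are assumptions)
[Corrupt bucket check]
# Same search as above: corrupt buckets only, ignoring hot buckets
search = | dbinspect index=* OR index=_* corruptonly=true | search state!=hot
# Time range used to select buckets; widen it if older buckets matter
dispatch.earliest_time = -30d@d
dispatch.latest_time = now
# Run once a day at 06:00
enableSched = 1
cron_schedule = 0 6 * * *
# Trigger when the search returns at least one result
counttype = number of events
relation = greater than
quantity = 0
# List it under Triggered Alerts and send an e-mail (placeholder address)
alert.track = 1
action.email = 1
action.email.to = splunk-admins@example.com
```

The same alert can be created from the UI by running the search and choosing Save As > Alert, which may be easier if you do not manage .conf files directly.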

r. Ismo 

sloshburch
Splunk Employee

A peer of mine shared this search. Does it jibe with your environment? I wanna see if we can add these things into the MC as well, so I'm curious to hear how you make out.

index=_internal sourcetype=splunkd component=ProcessTracker (BucketBuilder OR JournalSlice) (NOT "rawdata was truncated")
| eval message=replace(message, "^\(child.*?\)\s+", "")
| bin _time span=1m
| stats c by _time, host, splunk_server, message
| fields - c
| rename splunk_server as Indexer, host as Host, message as Issue

jamesoconnell
Path Finder

Thank you Mr. Burch. I tried running this but didn't get any results.

This could either mean that we don't have any bucket issues, or your search isn't worth the paper it is written on -- not sure which.

I'm not sure where the truth lies yet, but I am guessing we must have some bucket issues somewhere given the amount of data we pump each day.

More testing required I think.

thank you!

isoutamo
SplunkTrust

We couldn't see any issues with the previous search either, but there are still a couple of corrupted buckets (e.g. journal.gz was only a couple of bytes).

sloshburch
Splunk Employee

Would you provide more detail on how you identified the buckets were corrupted? That might add color into an existing way to be notified.

jamesoconnell
Path Finder

There was an exclamation symbol / warning on the Dashboard with some cryptic message saying there was an error related to the indexer in question: "[indexer_] Streamed search execute failed because: JournalSliceDirectory: Cannot seek to rawdata offset 0 ..."
This type of error scares the crap out of users, and they freak out to the admin...
