Lost one indexer on cluster

dieguiariel
Path Finder

Hi, over the weekend we had an electrical problem at our secondary datacenter, and it currently has no power. One indexer was in that datacenter; it was clustered with another indexer.

Are there any tasks that I must do in the meantime? The estimated recovery time for the datacenter is 3 to 4 days. Should I maybe put the indexers in maintenance mode?

I've read this:

https://docs.splunk.com/Documentation/Splunk/9.1.1/Indexer/Whathappenswhenapeergoesdown

but it only talks about what happens with bucket fixing.

Regards.

PickleRick
SplunkTrust

Generally speaking, a properly designed cluster should continue to function normally. The primaries were reassigned when you lost your indexer, some replication might have been triggered to make your cluster complete, and life goes on. When you get your power back, the indexer should rejoin your cluster and you should have surplus buckets (which you might be able to get rid of).

The question is whether your cluster was properly configured. Since you're talking about losing just one indexer in "another datacenter" and it being "on cluster with another indexer", that might well mean that you had just two indexers and rf=sf=2 (if you had rf=sf=1, you're in trouble already).

In that case your cluster can be searchable but not complete, and when the currently offline indexer rejoins the cluster, the CM will trigger replication of all buckets ingested during the downtime.
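To make the rf/sf reasoning above concrete, here is a minimal Python sketch of a worst-case model. It is not Splunk code and does not call any Splunk API; the function and field names are made up for illustration. It assumes every bucket met rf and sf before the outage, that a bucket can lose one copy per failed peer, and that the CM has finished whatever fix-up the surviving peers allow.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    searchable: bool  # every bucket still has at least one searchable copy
    complete: bool    # every bucket can again meet both rf and sf


def state_after_outage(total_peers: int, peers_down: int, rf: int, sf: int) -> ClusterState:
    """Hypothetical worst-case model of an indexer cluster after peers go down."""
    if not (1 <= sf <= rf <= total_peers):
        raise ValueError("expected 1 <= sf <= rf <= total_peers")
    if not (0 <= peers_down <= total_peers):
        raise ValueError("peers_down must be between 0 and total_peers")

    peers_up = total_peers - peers_down
    # Worst case: the failed peers all held a copy of the same bucket.
    copies_left = max(rf - peers_down, 0)

    # Assumption: a surviving copy can be made searchable during fix-up,
    # so one remaining copy is enough to keep the bucket searchable.
    searchable = copies_left >= 1
    # rf copies must live on rf distinct peers, so enough peers must be up.
    complete = searchable and peers_up >= rf
    return ClusterState(searchable=searchable, complete=complete)


if __name__ == "__main__":
    # Two indexers, rf=sf=2, one lost: searchable but not complete.
    print(state_after_outage(total_peers=2, peers_down=1, rf=2, sf=2))
    # Two indexers, rf=sf=1, one lost: some buckets have no copy left.
    print(state_after_outage(total_peers=2, peers_down=1, rf=1, sf=1))
```

With two indexers the cluster cannot become complete again until the second peer is back, which is why the replication only happens on rejoin.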

_JP
Contributor

Assuming things are configured "correctly" for what you expect a Splunk cluster to be, I would assume you're fine for right now. For example, the Cluster Master probably has eyes on that missing indexer and is waiting for it to come up, and has been communicating the necessary info to the rest of your Splunk deployment, like "Hey UF, don't send your data to this indexer" and "Hey SH, this indexer is no longer a search peer."

Can you describe how your cluster is configured, in particular the replication and search factors? How many indexers do you have (e.g. is this one of two)?

That could give us a hint about what you should expect when things do come back up. For example, if that indexer comes up and says hello to your Cluster Master, then the CM is going to start any replication/balancing of buckets that is needed. That's going to take a chunk of your network pipe that you might not want to give up when this data center comes up, so if that indexer is lower priority, you could leave Splunk shut down on it until your higher-priority stuff in that data center is recovered.
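For a rough feel of how long that catch-up traffic might occupy the link, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function name, the idea that the backlog is roughly the on-disk size of the data ingested during the outage, and all the numbers in the usage line. Plug in your own ingest volume, on-disk ratio, and link figures.

```python
def catchup_hours(daily_ingest_gb: float, outage_days: float,
                  disk_ratio: float, link_mbps: float, link_share: float) -> float:
    """Estimate hours needed to copy the replication backlog to the rejoining peer.

    daily_ingest_gb: raw data indexed per day (GB, decimal)
    outage_days:     how long the peer was offline
    disk_ratio:      assumed on-disk size of a bucket copy relative to raw data
    link_mbps:       inter-datacenter link speed in megabits per second
    link_share:      fraction of the link you let replication use (0 < share <= 1)
    """
    if link_mbps <= 0 or not (0 < link_share <= 1):
        raise ValueError("link_mbps must be > 0 and link_share in (0, 1]")

    backlog_gb = daily_ingest_gb * outage_days * disk_ratio
    usable_mbps = link_mbps * link_share
    # 1 GB = 8000 megabits (decimal units)
    seconds = backlog_gb * 8 * 1000 / usable_mbps
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical inputs: 100 GB/day, 4-day outage, copies at half of raw size,
    # 1 Gbps link of which replication gets a quarter.
    hours = catchup_hours(100, 4, 0.5, 1000, 0.25)
    print(f"about {hours:.1f} hours of replication traffic")
```

The estimate ignores protocol overhead and any throttling the cluster applies, so treat it as a lower bound when deciding whether to delay starting Splunk on the recovered indexer.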
