I see the errors below in the search head cluster. Can someone help resolve the issue?
02-11-2020 13:59:26.997 +0000 WARN ArtifactReplicator - Replication connection to ip=10.164.196.166:8999 timed out
02-11-2020 13:59:26.997 +0000 WARN ArtifactReplicator - Connection failed
02-11-2020 13:59:26.997 +0000 WARN ArtifactReplicator - event=artifactReplicationFailed type=ReplicationFiles files="/opt/splunk/var/run/splunk/dispatch/splunktemps/send/s2s/schedulerpbasav_ZWVfc2VhcmNoX3NwbHVua19zdXBwb3J0_RMD59b3a79690728a412_at_1581429480_498_638683B3-25D9-4D2A-AF2E-4E43362FDBFA-644D578C-F001-4711-B459-2338E22DF399.tar" guid=644D578C-F001-4711-B459-2338E22DF399 host=xx.xx.xxx.166 s2sport=8999 aid=746. Connection failed
We see that some of the reports are generated without data, but only some of the time. We are not sure what is causing it.
In your SHCluster status, are all the SHC members up?
Do you see any errors from host=xx.xx.xxx.166 in index=_internal?
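Since the warning says the replication connection timed out, it may also be worth ruling out a plain network problem between the members. Below is a minimal sketch, not an official Splunk tool, that only checks whether a TCP connection to the replication port can be opened. The function name `port_reachable` and the 5 second timeout are my own assumptions; the host and port (8999 in the logs above) are passed on the command line.

```python
import socket
import sys


def port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_port.py <member-host> 8999
    if len(sys.argv) == 3:
        host, port = sys.argv[1], int(sys.argv[2])
        state = "reachable" if port_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

Run it from each member against every other member. A successful connect only shows the port is open; it does not prove that replication itself is healthy.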
I see similar errors on host 166:
ArtifactReplicator - Replication connection to ip=xx.xx.xxx.164:8999 timed out
02-11-2020 10:59:10.203 +0000 WARN ArtifactReplicator - Connection failed
02-11-2020 10:59:10.203 +0000 WARN ArtifactReplicator - event=artifactReplicationFailed type=ReplicationFiles files="/opt/splunk/var/run/splunk/dispatch/splunktemps/send/s2s/scheduler_c3ZjX3N1bW1hcmlzZXI_ZWVfc2VhcmNoX2V4cG9zdXJlbGF5ZXI_RMD586fb6099b3dba69f_at_1581418680_15799_644D578C-F001-4711-B459-2338E22DF399-638683B3-25D9-4D2A-AF2E-4E43362FDBFA.tar" guid=638683B3-25D9-4D2A-AF2E-4E43362FDBFA host=xx.xx.xxx.164 s2sport=8999 aid=928179. Connection failed
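To see whether the failures go in one direction or both, the `event=artifactReplicationFailed` lines can be tallied by target host and port. This is a rough sketch that assumes the line layout shown in the warnings above (`host=... s2sport=...`); the regular expression and the function name `count_failures` are my own and may need adjusting for other Splunk versions.

```python
import re
import sys
from collections import Counter

# Matches the target host and port in an artifactReplicationFailed warning.
FAILED = re.compile(
    r"event=artifactReplicationFailed\b.*?\bhost=(?P<host>\S+)\s+s2sport=(?P<port>\d+)"
)


def count_failures(lines):
    """Count failed artifact replications per (target host, port)."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            counts[(match.group("host"), int(match.group("port")))] += 1
    return counts


if __name__ == "__main__":
    # Usage (hypothetical script name): python count_failures.py <path-to-log-file>
    if len(sys.argv) == 2:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
            for (host, port), total in count_failures(handle).most_common():
                print(f"{host}:{port} -> {total} failed replications")
```

If one member shows up as the target far more often than the others, that member (or the network path to it) is the first place to look.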
Do you have any large lookups (or any other files) which are being replicated?
It may be that the artifacts are too large to replicate in a timely fashion.
If you have large files:
1.) Do they need to be in the app? Some apps bundle other TAs or installers, which should be removed or blacklisted. (Update them on the deployer.)
2.) If your lookups are large, can you move them to the KV store and remove the CSV?
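A quick way to check for large files is to walk the apps directory and list everything above a size threshold. The sketch below is only an illustration: the function name `large_files` and the 50 MB default are arbitrary assumptions, and the directory to scan is whatever you pass in (for example the apps directory under your Splunk installation).

```python
import os
import sys


def large_files(root, min_bytes):
    """Return (size, path) tuples for files under `root` of at least `min_bytes`, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            if size >= min_bytes:
                found.append((size, path))
    return sorted(found, reverse=True)


if __name__ == "__main__":
    # Usage (hypothetical script name): python large_files.py <directory> [min_megabytes]
    if len(sys.argv) >= 2:
        megabytes = float(sys.argv[2]) if len(sys.argv) > 2 else 50.0
        for size, path in large_files(sys.argv[1], int(megabytes * 1024 * 1024)):
            print(f"{size / (1024 * 1024):8.1f} MB  {path}")
```

Anything it reports is a candidate for removal from the app, blacklisting, or a move to the KV store.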
The lookups are not too large.
Also, I see this error:
02-11-2020 03:04:18.222 +0000 INFO SHCSlave - event=SHPSlave::handleReplicationError aid=scheduler_ZXhwb3N1cmVsYXllcl9yZXBvcnRpbmc_ZWVfc2VhcmNoX2V4cG9zdXJlbGF5ZXI__RMD521a333d1d9f1f391_at_1581390180_12973_644D578C-F001-4711-B459-2338E22DF399 src=644D578C-F001-4711-B459-2338E22DF399 tgt=638683B3-25D9-4D2A-AF2E-4E43362FDBFA failing=644D578C-F001-4711-B459-2338E22DF399 queued replication error j
I am facing the same error as well. Could anyone please suggest a resolution?
Yes, all cluster members are up.