WARN ConfMetrics - single_action=PULL_FROM took wallclock_ms=4610! Consider a lower value of conf_replication_max_pull_count in server.conf on all members
What should I base the value on for conf_replication_max_pull_count? The warning is telling me that the cluster nodes are taking too long to pull configuration changes from the captain. Is my understanding correct?
conf_replication_max_pull_count = <int>
* Controls the maximum number of configuration changes a member will
replicate from the captain at one time.
* A value of 0 disables any size limits.
* Defaults to 1000.
Unless advised by Support, it's probably not a good idea to modify the conf_replication_max_pull_count setting.
The WARN itself is not necessarily a problem, unless it corresponds to slow UI response times and/or general system problems.
In general, note that this message is based on wallclock time. That means any performance problem on the system – e.g. memory pressure or contention for CPU – can cause this WARN. It isn't always a problem with the configuration replication workload itself.
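Since the WARN only matters when it lines up with other symptoms, it can help to check how often it fires and how large the wallclock values are. Below is a minimal Python sketch, assuming the log lines follow the format quoted at the top of this thread. The function name `slow_actions` and its `threshold_ms` parameter are made up for illustration and are not part of Splunk.

```python
import re

# Matches the ConfMetrics warning in the format quoted in this thread.
# \s+ tolerates variable spacing between the log level and the component.
WARN_RE = re.compile(
    r"WARN\s+ConfMetrics - single_action=(?P<action>\w+) "
    r"took wallclock_ms=(?P<ms>\d+)"
)

def slow_actions(lines, threshold_ms=0):
    """Return (action, wallclock_ms) for each ConfMetrics WARN at or above threshold_ms."""
    hits = []
    for line in lines:
        match = WARN_RE.search(line)
        if match and int(match.group("ms")) >= threshold_ms:
            hits.append((match.group("action"), int(match.group("ms"))))
    return hits

# The first entry is the line from the question; the second is a
# placeholder non-matching line, not real Splunk output.
sample = [
    "WARN ConfMetrics - single_action=PULL_FROM took wallclock_ms=4610! "
    "Consider a lower value of conf_replication_max_pull_count in server.conf on all members",
    "some unrelated log line",
]
print(slow_actions(sample))
```

Feeding it the lines of a log file (for example, `slow_actions(open(path))` with a path of your choosing) gives a quick list to compare against the times when the UI felt slow.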
If the WARN message corresponds to slow UI response times and/or general system problems, then please contact Support and provide the following artifacts for further analysis:
1.) Collect new diags from the captain and from at least one of the member nodes
2.) On each of the search heads, please take a backup of the latest bundle file under var/run/splunk/snapshot to a temporary directory, and provide those as well
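Step 2 could be scripted roughly as follows. This is only a sketch: it assumes "latest" means the most recently modified file in the snapshot directory, and the function name `backup_latest_bundle` is hypothetical. The full path depends on where Splunk is installed on each search head.

```python
import shutil
import tempfile
from pathlib import Path

def backup_latest_bundle(snapshot_dir, dest_dir=None):
    """Copy the most recently modified file in snapshot_dir to dest_dir.

    If dest_dir is not given, a new temporary directory is created.
    Returns the path of the copy.
    """
    files = [p for p in Path(snapshot_dir).iterdir() if p.is_file()]
    if not files:
        raise FileNotFoundError(f"no files found in {snapshot_dir}")
    # Assumption: the "latest" bundle is the one with the newest mtime.
    latest = max(files, key=lambda p: p.stat().st_mtime)
    if dest_dir:
        dest = Path(dest_dir)
        dest.mkdir(parents=True, exist_ok=True)
    else:
        dest = Path(tempfile.mkdtemp(prefix="snapshot_backup_"))
    # copy2 preserves timestamps, which may be useful for later analysis.
    return Path(shutil.copy2(latest, dest))

# Example call (the install prefix is an assumption, adjust to your system):
# backup_latest_bundle("/opt/splunk/var/run/splunk/snapshot")
```

The original file is left in place, so the copy can be handed to Support without touching the running search head.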
What is your cluster's replication factor?
replication_factor = 1
but that only applies to replication of search artifacts.
The WARN message refers to configuration changes, such as knowledge objects being changed by users via the UI.