Getting Data In

index performance issue high latency

g_prez
Path Finder

Question:
I am seeing high latency on a lot of my source types in Splunk.
By high latency, I mean it takes over 24 hours to index some events.
The SoS (Splunk on Splunk) app shows that most, if not all, of the hosts sending messages to the syslog messages file have high latency.
The server hosting Splunk seems OK: it has 10 CPUs and runs on average at 5-10% user processes and 1.8 I/O.

Does Splunk multi-thread indexing? Or, my real question: how do I get the latency on my events down to something more reasonable, and not in the 24-hour range?

Simeon
Splunk Employee

Splunk should not fall behind unless:

  1. Data does not show up
  2. The forwarder is blocked/limited on sending data
  3. The indexer is blocked/limited on indexing data

By default, lightweight forwarders limit data volumes to 256 KB per second. If you have full forwarders, you should not see this limit. I imagine that the forwarder or indexer is getting blocked somehow, or the data just never shows up. To see if it is blocked, run this search:

index=_internal source=*metrics.log blocked
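
If that search returns events, a slightly more detailed variant (a sketch; it assumes the standard group=queue events in metrics.log with the name and blocked fields) shows which queue is backing up and how often:

index=_internal source=*metrics.log* group=queue blocked=true
| timechart count by name

A count that keeps climbing for one particular queue usually means the stage downstream of that queue cannot keep up.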

To see if Splunk is not getting the data immediately, you can run the following search to find out when the data was indexed:

host=your_host sourcetype=syslog | eval indextime=_indextime | fields indextime | convert ctime(*time)
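
To measure the lag itself rather than eyeballing the timestamps, a simple variation (a sketch; it just subtracts the event time from the index time) is:

host=your_host sourcetype=syslog
| eval latency=_indextime-_time
| stats avg(latency) AS avg_latency_secs max(latency) AS max_latency_secs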

g_prez
Path Finder

Yep, the indexer is getting blocked, and as SoS was saying, most are syslog sources and syslog is being sent to a file.
I am not seeing I/O issues on the box, and one would think that if the indexer is being blocked it is due to an I/O issue. So what would cause an indexer to be blocked? Volume? If that is the case, then the volume we are running is rather low for the hardware we have in place.
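
To see where in the pipeline things back up, one option is to chart queue fill levels (a sketch; current_size_kb and max_size_kb are the usual metrics.log queue fields, but the names can vary by Splunk version):

index=_internal source=*metrics.log* group=queue
| eval fill_pct=round(current_size_kb/max_size_kb*100,1)
| timechart perc95(fill_pct) by name

A queue that sits near 100% while the queues after it stay mostly empty is the likely bottleneck.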


g_prez
Path Finder

To add: looking at the Splunk reports on indexer activity, the "CPU utilization by index-time processor in the last 1 hour" chart shows a peak CPU load of 0.016% on the indexer process, and that is the highest of all the "splunk" processes.
Also, I was way off on the indexing volume; it is in the 8 GB per day range.
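
The same per-processor CPU numbers can also be pulled straight from metrics.log (a sketch; it assumes the standard group=pipeline events with cpu_seconds and processor fields):

index=_internal source=*metrics.log* group=pipeline
| timechart sum(cpu_seconds) by processor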


g_prez
Path Finder

Standard operation: syslogd dumping to the messages file and Splunk monitoring the messages file.
We also have about 5 heavy forwarders.
Also, the high latency is a recent development. Furthermore, only some of the hosts in syslog have high latency and some hosts do not, which is strange. As stated, it does not seem to be an I/O or sizing issue, but I will check the manual for sizing info just in case we missed something.

And finally, the indexer is running about 15 GB per day.
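
Since only some hosts are lagging, a per-host breakdown of the indexing lag may help pin them down (a sketch, reusing the _indextime - _time approach from above):

sourcetype=syslog
| eval latency=_indextime-_time
| stats avg(latency) AS avg_secs max(latency) AS max_secs count by host
| sort - max_secs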


lguinn2
Legend

How is the data getting to the server hosting Splunk (the indexer)? Can you describe the topology (how many files being monitored, how many forwarders, how many MB/GB per day being generated)?

In a properly sized and configured system, indexing latency should be measured in seconds. Take a look at the first section of the Installation manual for sizing info.
