Hello,
We have determined that in Splunk 5.0, active UDP inputs cause the main splunkd process to leak memory. The rate of this memory leak appears to be proportional to the rate of data that is being received on the UDP input(s). For that reason, it is possible for a very active UDP input to cause splunkd to eventually exhaust all available memory on the host.
This behavior is tracked as bug SPL-58075, which has been added to the list of known issues for Splunk 5.0.
We are actively working towards the release of a fix to this issue in the next few days.
In the meantime, we can propose four possible workarounds:
Install a 4.3.4 universal forwarder on the same machine that is currently receiving the UDP traffic and migrate the UDP inputs to that instance. The universal forwarder should be configured to send all incoming data to the indexer on the same host.
Where possible, customers should switch from sending data via UDP to sending via TCP, as this reduces potential data loss and is more in line with best practices for sending network data to Splunk. Be advised that a TCP input does not perform syslogd event augmentation (e.g. timestamp and hostname prepending).
Schedule a restart of the impacted instance(s) at regular intervals to prevent memory exhaustion. This solution is not strongly recommended, as it can introduce data loss and log users out of the system.
Disable UDP inputs. This solution is not recommended since all data from the sending host(s) will be lost.
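As a rough sketch of the first two workarounds, the universal forwarder on the indexer host might be configured as follows. The port numbers (UDP 514, TCP 1514, indexer port 9997) and the sourcetype are illustrative assumptions only; adjust them to match your environment.

```ini
# inputs.conf on the 4.3.4 universal forwarder
# Workaround 1: the UDP input migrated off the 5.0 splunkd process
[udp://514]
sourcetype = syslog

# Workaround 2: where senders support it, a TCP input avoids UDP entirely
# (note: no syslogd-style timestamp/hostname prepending on this input)
[tcp://1514]
sourcetype = syslog
```

```ini
# outputs.conf on the same forwarder: send all incoming data
# to the indexer running on the same host (assumed listening on 9997)
[tcpout]
defaultGroup = local_indexer

[tcpout:local_indexer]
server = 127.0.0.1:9997
```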
We disabled the Splunk Deployment Monitor app and our memory consumption has been flat ever since.
@richgalloway: Are you running the most recent 5.x version (5.0.2 as of now)? I would typically recommend to use the S.o.S app to track the resource usage of Splunk processes and establish a clear pattern. With that information, you'll want to open a support case to get this investigated further.
We're also running out of memory, but have zero UDP inputs configured. Could there be another source of the leak? We increased our VM from 8GB to 16GB and splunkd used it up in two days.
What types of input do you have, and are you running any special apps? You should probably run a "splunk diag" and attach the diag file to the support case.
/k
I upgraded Splunk 4.3.4 to v5 and have a handful of event sources. Apart from having to restart the service, the performance hasn't changed a bit. Response time is just the same as well.
Yes, Splunk 5.0 uses more resources. But in our case, it may also be a configuration issue (deployment server, apps, backfill ...).
I'm seeing similar issues on our heavy forwarders. I'll follow the above suggestions and see if that sheds any light!
I am seeing this same issue on one of two indexers after the 5.0 upgrade.
Memory is leaking on one of them (the splunkd process) until fully consumed.
There are some configuration differences between the two. I will troubleshoot this issue some more.
Hi,
I also see a steady increase of memory usage on our search head since the upgrade to v5.0. The indexers and universal forwarders are running fine, though.
To investigate a memory leak, track the resource usage of the splunkd process over time (for example with the S.o.S app) and capture screenshots of the trend. Then create a support case and attach a diag and the screenshots; also specify whether you turned on special features of 5.0 (like replication or search acceleration).
I have to agree something is causing rapid loss of memory (and not my turning 50 this December!).
We are seeing the same thing since 5.0.
4 GB system, not that many inputs... It was running smoothly on 4.3.4, but since the upgrade it exhausted all memory and froze... We added 2 GB of extra RAM, and while the server started up fine, within half an hour all the RAM had been consumed and it was unusable again.
We have had to reboot our v5 indexer twice in the last 24 hours. The system crawls to a halt, and looking at the stats, the memory grows in a straight line until all is consumed and the system effectively stops. We are only taking syslogs from 6 firewalls and event logs from two Windows boxes. The indexer has 8 GB of memory available. It looks like it might be a memory leak.