Splunk speed performance

ArousOussema
Engager

Greetings,

I'm a student at the Hochschule Darmstadt in Germany. I'm currently working on a university project in which we're trying to find a suitable log management tool for our big data cluster.

It would be very helpful if you could provide me with some information.

 

Equipment: 

 

We have 48 nodes:

28 x Dell PowerEdge C6220
  • 2 x Intel Xeon E5-2609 (4 cores each)
  • 64 GB RAM
  • 16 x 1 TB SATA 7.2k

20 x Dell PowerEdge C6320
  • 2 x Intel Xeon E5-2620v2 (6 cores each)
  • 128 GB RAM
  • 16 x 1 TB SATA 7.2k

 

The nodes are connected with a high-bandwidth and low-latency network.

 

Every node currently generates 500 MB of logs per day, for a total of 24 GB of logs per day.

 

The criteria we're considering are as follows:

 

  1. The log management tool should process the generated logs within 10 seconds, measured from generation to arrival: that is, from the log source, through the universal forwarder, until the event is ready for search in Splunk.

  2. Splunk UI interactions should complete within 1 second.

 

We can use 24 nodes to scale Splunk and thereby speed up processing.

 

Can Splunk meet these criteria? 

Are there any performance calculations we can do so that, if the log volume changes, we can maintain a response time of 10 seconds?

 

Your help is very much appreciated.

 

 
 

cpetterborg
SplunkTrust

See System Requirements for more information on your hardware needs.

If you intend to use the hardware you described in your post, you may be disappointed, whatever tool you decide upon. 24 GB/day is not much (we handle more than 1,000 times that on our clusters), but the core count is low: even your best hardware only reaches the minimum requirement. The drive configuration is also suspect. It really depends on the number of IOPS you can get from the drives, and configuring them badly will degrade performance. If no one searches the data, you could probably keep up with ingestion, but searching AND ingesting at the same time will probably be slow, and you won't be pleased. I used to manage a cluster that did 200 GB/day with 3 indexers, but each one had 16 cores and less RAM. It did fine. Where I am currently, we use 48-core machines with 256 GB RAM and SSDs for hot storage. The clusters have more than 100 indexers (IDXs) in them (we have 5 clusters).
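On the question of calculations: a rough back-of-envelope sketch in Python, using only the figures quoted in this thread, can put the volumes in perspective. It is plain arithmetic, not a Splunk sizing formula. It assumes decimal units (1 GB = 1000 MB), logs arriving evenly over the day, and, purely for illustration, that all 24 available nodes act as indexers. The function names are made up for this example.

```python
# Back-of-envelope ingest arithmetic using the figures quoted in this thread.
# Assumptions: decimal units, evenly spread arrivals, all 24 nodes as indexers.

SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_gb(nodes: int, mb_per_node_per_day: float) -> float:
    """Total daily log volume in GB (1 GB = 1000 MB)."""
    return nodes * mb_per_node_per_day / 1000


def average_rate_kb_per_s(total_gb_per_day: float) -> float:
    """Average ingest rate in KB/s if logs arrive evenly over the day."""
    return total_gb_per_day * 1_000_000 / SECONDS_PER_DAY


def per_indexer_gb(total_gb_per_day: float, indexers: int) -> float:
    """Daily volume each indexer handles if the load is spread evenly."""
    return total_gb_per_day / indexers


total = daily_volume_gb(nodes=48, mb_per_node_per_day=500)
print(f"total volume:        {total:.1f} GB/day")
print(f"average ingest rate: {average_rate_kb_per_s(total):.0f} KB/s")
print(f"per indexer (24):    {per_indexer_gb(total, 24):.1f} GB/day")
# Reference point from this reply: 200 GB/day on 3 indexers with 16 cores each.
print(f"reference cluster:   {per_indexer_gb(200, 3):.1f} GB/day per indexer")
```

Averages hide bursts, and this says nothing about search load or disk IOPS, which are the parts most likely to hurt on the hardware described. Treat it as a way to re-run the numbers when the log volume changes, not as a guarantee of latency.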

The following information is given with the assumption that you are using effective server hardware:

Splunk is very capable at receiving data, and data sent to it is 95% likely to be searchable within 5 seconds. However, that depends on how the data is sent to Splunk and on how much data is going to the indexers (or at least to the parsing layer you use). I would say that it meets the first criterion just fine.
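To check a target like "searchable within 10 seconds" against your own data, one option is to compare each event's generation time with the time it was indexed (Splunk exposes these as the `_time` and `_indextime` fields) and look at a high percentile of the difference. The Python sketch below shows only the arithmetic. The sample pairs are hypothetical placeholders, not measurements, and the helper names are invented for this example.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_target(latencies_s, pct=95, target_s=10.0):
    """True if the pct-th percentile latency is within the target (seconds)."""
    return percentile(latencies_s, pct) <= target_s


# Hypothetical (generated, indexed) epoch-second pairs; replace with real exports.
samples = [(1000.0, 1001.2), (1000.5, 1002.0), (1001.0, 1004.5), (1002.0, 1014.0)]
latencies = [indexed - generated for generated, indexed in samples]

print("p95 latency (s):", percentile(latencies, 95))
print("meets 10 s target:", meets_target(latencies))
```

A percentile is used rather than the mean because a few late events (a forwarder restart, a burst) would otherwise be averaged away, and those are exactly the events that break a hard 10-second requirement.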

As to the second criterion, getting UI interactions within 1 second, that is a very hard one to determine. It depends on a number of things:

  • How good/bad is the search
  • How long is the time range over which you are searching
  • How much data is going to be returned by the indexers
  • How many of the fields are indexed at index time vs search time
  • etc.

Some searches will return within a second if you have all the stars aligned and the moon is full. But a search depends on so many variables that it is hard to say. That is the case with ANY log tool. Never assume that all interactions will be fast. I've seen really bad searches that never complete in a day. (I usually see this from users who have no clue what they are doing and do all-time searches in a cluster with over 200 IDXs, in an index that is over 200 TB in size. These get killed, and we have a serious talk with the user who does this.) Can you get fast interactions with Splunk? Absolutely. Can you mess it all up and get slow ones? Absolutely.

If you don't know what you are doing you can keep a Ferrari from going more than 10 miles an hour. The important thing is to learn all you can so that you don't keep it from performing optimally. 
