I want to retrieve hundreds of millions of events from billions of events, but each search takes more than an hour.
I'm using the simplest possible search: index="test" name=jack
But it's very slow.
Then I checked memory and CPU usage: each search uses only 200-300 MB of memory.
So I modified the max_mem_usage_mb, search_process_memory_usage_percentage_threshold, and search_process_memory_usage_threshold parameters in $SPLUNK_HOME/etc/apps/search/local/limits.conf, but they didn't seem to make a significant difference.
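Roughly, my local limits.conf now looks something like this (the values are just examples of what I tried, and the parameter-to-stanza mapping should be double-checked against limits.conf.spec):

```
# $SPLUNK_HOME/etc/apps/search/local/limits.conf (example values only)
[default]
max_mem_usage_mb = 2000

[search]
search_process_memory_usage_threshold = 8000
search_process_memory_usage_percentage_threshold = 50
```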
Is there any effective way to improve the speed of my search?
Thanks! 🙂
You'll find Jack's company much faster if you also specify how to find a company in your search. What that looks like depends on your data, which you didn't share with us - knowing your data would help.
That could look like one of these:
index=foo sourcetype=company_register name=jack
index=foo category=employees name=jack
etc.
If you have an accelerated datamodel, it could look like this:
| tstats summariesonly=t values(company) as companies from datamodel=your_model where your_model.name=jack
To chain that, you could build a dashboard with in-page drilldowns that step through the tree you expect in your data.
Let me know when you're ready to stop trolling and want to tell us what you want to achieve.
I just want to improve the speed of my search. I don't know what you mean!
@qazwsxe
Sometimes the simplest searches are also the most vague ones. For best performance, here are a few suggestions you can use to optimize your query.
Be as precise with your base search as you can. It reduces the scope of the search, which is a great help with larger indexes. So include everything you know about the events you want.
Adding a sourcetype after your index will help narrow the scope of searched events, e.g.:
`index="test" sourcetype=<yourST> name=jack`
Keep your time range as narrow as your requirement allows.
If you are working in an indexer cluster, make sure your index data is split across all indexers. This helps avoid putting the load of a search on a single indexer instance.
And lastly, do take a look at the job inspector for your searches and analyse where most of the time is spent. Those will be the areas to work on.
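Putting those suggestions together, a tightened version of your original search might look like this (the sourcetype and time range are placeholders you'd replace with values from your own environment):

```
index="test" sourcetype=<yourST> name=jack earliest=-24h@h latest=now
```

Every filter you add to the base search lets the indexers discard events before they ever reach the search head.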
Thanks.
You can save your query as a report and then accelerate it for the period you want. This can be an alternative that gives you faster results at search time, but be aware that it still causes extensive resource usage (only in the background, i.e. at off-times when you might not be searching).
You would also have to use a transforming command in your search to make it eligible for acceleration.
Check this for more details -
https://docs.splunk.com/Documentation/Splunk/7.3.0/Report/Acceleratereports
https://docs.splunk.com/Documentation/Splunk/7.3.0/Knowledge/Manageacceleratedsearchsummaries
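For example, a transforming version of the original search (the field `source` here is only an illustration - use whatever field you actually want to aggregate on) could be:

```
index="test" name=jack | stats count by source
```

The `stats` command is a transforming command, so a report saved from this search is eligible for acceleration.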
I will try it, thanks.
Even if the search conditions are accurate, the speed is slow. There's a lot of data, maybe billions of events, so the search is slow.
When the search command is executed, memory and CPU usage are minimal. Using the same data, ELK takes up a lot of memory and CPU, and it's much faster than Splunk.
So, what should I do?
Thanks.
So what's your environment (regarding indexer(s), search head(s), and storage)? And where are you examining the resource consumption?
I just created an index called test. I view CPU and memory usage through Task Manager.
My first question was: what is your environment? Do you run Splunk in a standalone installation? What kind of storage do you have attached?
My test environments are Windows 10 and Ubuntu 16. I run Splunk in a standalone installation. My storage is an SSD, and there's a lot of free space.
OK, this way we can get further. I should have asked earlier, but: are you using virtual machines? What are the characteristics of your servers/VMs regarding CPU and RAM? I assume you have one (i.e. "1") SSD in your environment?
I use physical machines. I'm not sure of the CPU information. It's a server, so its performance won't be weak. And the RAM is 240 GB. Finally, you're right, there is one SSD.
For the Linux machine you could use `cat /proc/cpuinfo` to get processor information. The interesting figures are the number of cores and the CPU speed. Regarding the one SSD: Splunk recommends 800 IOPS, which one SSD probably isn't able to deliver. So this may be your bottleneck.
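On most x86 Linux systems, those specific figures can be pulled straight out of /proc/cpuinfo, for example:

```shell
# Number of logical cores: /proc/cpuinfo has one "processor" stanza per core
grep -c '^processor' /proc/cpuinfo

# CPU model (on most Intel/AMD CPUs this includes the rated clock speed)
grep -m1 'model name' /proc/cpuinfo
```
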
In your first post you wrote "hundreds of millions of data from billions of data" - what is that in GB? What is that as an event count?
Still, it's not clear what you want to achieve with your query. As @martin_mueller pointed out, there's not much sense in just displaying "hundreds of millions of data". So you probably want to do something else with your data. It would be very helpful - if not essential - to get an idea of what you're planning to do.
@martin_mueller
@qazwsxe one of the best ways to index and monitor KPI information is to use the Metrics Index, available since version 7.x. Each release introduces significant new features for metrics indexing, so do explore the latest version and its features (for example, the current latest version, 7.3.0, introduces Metrics Rollups). https://docs.splunk.com/Documentation/Splunk/latest/Metrics/Overview
https://www.splunk.com/blog/2019/06/18/navigating-data-chaos-with-splunk-metrics-workspace.html
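As a sketch of what querying a metrics index looks like (the index name `my_metrics` and metric name `cpu.usage` are just assumptions for illustration):

```
| mstats avg("cpu.usage") WHERE index=my_metrics span=5m
```

Because metrics are stored as pre-structured measurements rather than raw events, `mstats` searches like this are typically far cheaper than scanning an event index.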