I'm running a somewhat large Splunk installation that monitors syslog for >40k hosts. Every once in a while, a host goes crazy and starts logging as fast as its network will carry it (SCSI errors, OOMKiller, etc.).

Does anyone know of a way to alert in Splunk on a host that exceeds a certain number of messages per minute? I'd like to kick off a script or email whenever one host goes over, say, 1000 messages per minute, but with so many different hosts I can't really create a search with the hostnames predefined.

Any thoughts?

asked 16 Jul '12, 07:46

rgisrael


2 Answers:

Well, the good news is that you don't have to predefine the hosts... that's what fields are for :-)

Create an alert for your search like this:

sourcetype=syslog | stats count by host

Schedule it to be run every minute, with a relative time span of:

earliest: -1m@m
latest: @m

with a custom condition to email you when:

where count > 1000

From there you might want to tweak your search to throttle subsequent notifications, but there's an example of how you'd do what you're after.

Hope this helps :-)

answered 16 Jul '12, 08:06

R.Turk

edited 16 Jul '12, 20:41

Definitely looks like the right direction, but I get the following error message when I try to specify the custom condition:

"Encountered the following error while trying to update: In handler 'savedsearch': Cannot parse alert condition. Search operation 'count' is unknown. You might not have permission to run this operation."

I'm setting up this alert as the admin user, so permissions shouldn't be an issue.

(16 Jul '12, 08:27) rgisrael

What kind of admin would disable the "count" command? Maybe it's a typo?

(16 Jul '12, 09:35) yannK

Aha! It needs to be:

WHERE count > 1000

Thanks!

(16 Jul '12, 12:11) rgisrael

Yep, edited the answer accordingly (it was late when I did that one, sorry!)

(16 Jul '12, 20:44) R.Turk

Does this work on the free version?

Will I be able to migrate alerts when I update from 3.x to 4.x?

(18 Jul '12, 10:34) jyanga

Another way is to use time buckets. This is more flexible, because you can run the search over longer periods.

mysearch | bucket _time span=1m | stats count by _time host | WHERE count > 10000

This returns every host that logged more than 10000 events in any one minute, along with the minute in which it happened. Set the alert condition to number of results > 0 and attach the results to the email, and you have all the details: _time, host, count.
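As a rough sketch of the "longer periods" idea, the same bucketing could be run over a wider window and then rolled up per host, so the alert email carries one row per noisy host instead of one row per minute. The base search `sourcetype=syslog` is borrowed from the other answer, the 1000 threshold is the figure from the question, and the 15-minute window and the field names `minutes_over` and `peak_mpm` are assumptions made only for illustration:

```
sourcetype=syslog earliest=-15m@m latest=@m
| bucket _time span=1m
| stats count by _time host
| where count > 1000
| stats count as minutes_over max(count) as peak_mpm by host
| sort - peak_mpm
```

With something like this scheduled every 15 minutes and the alert condition left at number of results > 0, each result row would show how many minutes a host spent over the threshold and its busiest minute. Adjust the window and threshold to your own traffic.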

answered 16 Jul '12, 09:33

yannK

Asked: 16 Jul '12, 07:46

Seen: 694 times

Last updated: 18 Jul '12, 11:11

Copyright © 2005-2012 Splunk Inc. All rights reserved.