Our developers tend to use syslog carelessly. Yesterday, for example, one server decided to send 1,000 identical messages per second to let us know its DB instance was down. By the time it was taken care of, our license was busted on that indexer, again. Too many violations this month, so we're down hard.
I'm thinking of crafting a scheduled search like so:
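(A sketch only; the sourcetype, window, and threshold are placeholders for whatever fits our data.)

    sourcetype=syslog earliest=-5m
    | stats count by host
    | where count > 10000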
Based on that search, I'd like to set up an alert script that would grab the offending servers' IPs, run "iptables -I INPUT 1 -s $IP -j DROP", and send out an email/SNMP trap reporting that this has occurred.
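A minimal sketch of such a script, assuming Splunk's 4.x convention of passing the gzipped results CSV to alert scripts as argument $8, that the search puts the offending IP in the first column, and a placeholder mail address:

    #!/bin/sh
    # Splunk 4.x passes the gzipped CSV of search results as $8
    RESULTS_FILE="$8"

    # Skip the CSV header, take the first column as the IP,
    # and strip any CSV quoting (adjust the field to your search)
    for IP in $(zcat "$RESULTS_FILE" | tail -n +2 | cut -d, -f1 | tr -d '"'); do
        # Drop the flooding host ahead of all other rules
        iptables -I INPUT 1 -s "$IP" -j DROP
        # Placeholder notification address
        echo "Blocked $IP for flooding syslog" \
            | mail -s "Splunk alert: blocked $IP" oncall@example.com
    done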
However, in a distributed environment this task grows more complex. Do I schedule the search on every indexer? Or only on the search head, and then make the script capable of pushing the iptables commands out to the indexers? Neither solution seems ideal.
So how do you deal with the occasional big spike in traffic? I'm trying to avoid manual intervention, because these spikes often hit in the dead of night and I like to sleep.
asked 28 Apr '11, 09:59
I would recommend using Splunk's internal metrics for this:
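(A sketch; the per_host_thruput group in metrics.log tracks indexed KB per host in its series field, and 50 MB is just an example threshold.)

    index=_internal source=*metrics.log group=per_host_thruput
    | stats sum(kb) as kb by series
    | eval MB=round(kb/1024,2)
    | where MB > 50

Since this runs against the _internal index, the search itself adds nothing to your license usage.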
Then save the search and schedule it over your desired time window. Set the alert to trigger when MB > 50 and to run a script; the script is responsible for taking the hosts identified by the search, running iptables, and sending the email/trap.
You could have the Splunk alert handle the email part as well, depending on how you want to be notified.
answered 28 Apr '11, 10:29
Before 4.2 it was messy.
But now that I've been using the Deployment Monitor app in 4.2, it's been a breeze. I specifically use its "Forwarders Sending More Data Than Expected" search, with an alert set to fire when any forwarder hits 20% over its "Average Daily KBps". That search is built on Splunk's internal metrics data, so it costs no license volume of its own.
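This isn't the app's actual search, but the idea looks roughly like this against the tcpin_connections metrics group (field names as they appear in metrics.log; the one-hour window and 20% margin are illustrative):

    index=_internal source=*metrics.log group=tcpin_connections earliest=-7d
    | eval recent=if(_time > relative_time(now(), "-1h"), 1, 0)
    | stats avg(eval(if(recent=1, tcp_KBps, null()))) as recent_KBps
            avg(eval(if(recent=0, tcp_KBps, null()))) as baseline_KBps
        by hostname
    | where recent_KBps > baseline_KBps * 1.2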
answered 28 Apr '11, 10:26