I need help with a query to find the forwarders that have not reported for more than 2 weeks.
This is the search I use to track the last time a forwarder responded.
| metadata type=hosts | eval age = now() - lastTime | sort -age | search age > 86400 | eval d=floor(age/86400) | eval h=floor((age-(d*86400))/3600) | eval m=floor((age-(d*86400)-(h*3600))/60) | eval s=floor(age-(d*86400)-(h*3600)-(m*60)) | eval age =d." days ".h." hours ".m." minutes ".s." seconds" | fields age,host | rename age as Time_of_Last_Response
travis.
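The search above lists hosts that have been silent for more than one day (86400 seconds). For the two-week cutoff asked about in the question, a minimal sketch along the same lines might look like the following. The threshold 1209600 is simply 14 * 86400, and the field names Last_Event and Days_Silent are made up for illustration; adjust to taste.

```
| metadata type=hosts
| eval age = now() - lastTime
| search age > 1209600
| sort -age
| eval Days_Silent = floor(age/86400)
| convert ctime(lastTime) as Last_Event
| fields host, Last_Event, Days_Silent
```

This only uses commands already shown in this thread (metadata, eval, search, sort, convert, fields). Because it keys on host, it assumes each forwarder reports under its own host name.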
Thanks. This is what I was looking for.
I check my forwarder status with a shell script. That way I can check the indexer status also, and if I have a system which doesn't send logs very often, I can still make sure the forwarder is alive.
All of my forwarders are on Unix systems, so my script, invoked from cron every few minutes, does something like the snippet below.
I do take a shortcut because all of my forwarders' PID files are available via NFS. However, you should be able to get the idea:
# ToList (the mail recipients) is assumed to be set earlier in the script.
for splunker in `cat my_list_of_splunk_forwarders`
do
    # PIDs recorded in this forwarder's PID file (reachable over NFS)
    PIDList=`cat ${splunker}/var/run/splunk/splunkd.pid`
    PIDCount=`print ${PIDList} | wc -w`
    # The host name is the part of the list entry before the first "-"
    SplunkFwdr=`print ${splunker} | cut -d- -f1`
    SplunkdCounter=0
    for PID in ${PIDList}
    do
        # Ask the forwarder which command owns this PID; drop the ps header line
        ProcName=`ssh ${SplunkFwdr} ps -o comm -p "${PID}" | tail -n +2`
        if [[ ${ProcName} == "splunkd" ]]
        then
            SplunkdCounter=$((SplunkdCounter + 1))
        fi
    done
    ## There should always be at least two processes running:
    if [[ ${SplunkdCounter} -lt 2 ]]
    then
        print "${splunker} -- splunkd count: ${SplunkdCounter}" | mailx -s "[${splunker}][Splunk] Forwarder Count Mismatch [Found: ${SplunkdCounter}][Expected: ${PIDCount}]" ${ToList}
    fi
done
If you're on version 4, try a query such as this to see recently connected forwarders.
| metadata type=hosts | sort recentTime desc | convert ctime(recentTime) as Recent_Time