Getting Data In

Alerting on forwarder up/down events

Lowell
Super Champion

I have a saved search that notifies me when a forwarder goes up or down based on various TcpInputProc and TcpOutputProc messages coming from both the indexer and the forwarder machines.

The problem I'm running into is that I'm seeing a bunch of messages like this even when the forwarder is not going down. Anybody know why this is happening, or a more reliable message that I can use for this:

05-26-2010 11:24:18.346 INFO TcpInputProc - Hostname=host.domain.com closed connection

I'm wondering if this simply means that the connection was temporarily closed due to no data, but that would seem odd since I'm seeing this primarily on a few servers that are fairly busy.


For anyone interested. My full search runs ever 5 minutes, and looks like this: (Be prepared to do some scrolling)

 index=_internal sourcetype="splunkd" (TcpInputProc "closed connection" OR "Connection accepted from") NOT localhost | eval sender=if(searchmatch("TcpOutputProc"),host,"") | eval receiver=if(searchmatch("TcpInputProc"),host,"") | eval action=if(searchmatch("Connect* accepted OR to"),"up", "down") | eval sender=coalesce(Hostname,sender) | rex "to (?<receiver>[^:]+)(:\d+)?" | rex "from (?<sender>\S+)" | replace "dnsname.example.com" with "splunk.domain.com", "anotherdnsname.domain.com" with "therealservername.domain.com" in sender, receiver | stats min(_time) as start_time, max(_time) as end_time, list(action) as actions, first(action) as final_state by sender,receiver | eval start_time=strftime(start_time,"%I:%M %p") | eval end_time=strftime(end_time,"%I:%M %p")
Tags (2)
1 Solution

Simeon
Splunk Employee
Splunk Employee

I recommend using the hosts metadata and searching for events received. Metadata contains when the last event was received from a specific host, source, or sourcetype. You can use a where statement that compares the last time an event was received to ensure that data is streaming. The reasons this is better than searching the splunkd log:

  1. You will know an event has actually come through
  2. The search itself is significantly faster

The search I use is as follows:

| metadata type=hosts | eval diff=now()-recentTime | where diff < 600 | convert ctime(*Time)

This will tell you what hosts have sent data in the past 10 minutes.

View solution in original post

Simeon
Splunk Employee
Splunk Employee

I recommend using the hosts metadata and searching for events received. Metadata contains when the last event was received from a specific host, source, or sourcetype. You can use a where statement that compares the last time an event was received to ensure that data is streaming. The reasons this is better than searching the splunkd log:

  1. You will know an event has actually come through
  2. The search itself is significantly faster

The search I use is as follows:

| metadata type=hosts | eval diff=now()-recentTime | where diff < 600 | convert ctime(*Time)

This will tell you what hosts have sent data in the past 10 minutes.

Get Updates on the Splunk Community!

Everything Community at .conf24!

You may have seen mention of the .conf Community Zone 'round these parts and found yourself wondering what ...

Index This | I’m short for "configuration file.” What am I?

May 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with a Special ...

New Articles from Academic Learning Partners, Help Expand Lantern’s Use Case Library, ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...