Issue with livestatus - splunk.Intersplunk.getOrganizedResults() never returns a value

kuramanga
Explorer

I was trying to configure mklivestatus to work with Splunk For Nagios and discovered what I think is some kind of odd behaviour with the splunk.Intersplunk.getOrganizedResults() method.

I have mklivestatus working on the Nagios server and can see the data with unixcat. I can also get data with netcat from both the Nagios server and the Splunk server, and a simple Python script that reads from the mklivestatus instance (running via xinetd) returns data without problems.
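For readers who want to repeat that check, here is a minimal sketch of such a standalone script, written for Python 3. The function names are hypothetical, and the query headers are an assumption about a typical Livestatus request (the thread does not show the actual query that was used). The parser is deliberately naive and will mis-split any value that itself contains a comma.

```python
import socket


def build_query(table, columns):
    """Build a Livestatus query that asks for comma-separated output with a header row."""
    lines = [
        "GET %s" % table,
        "Columns: %s" % " ".join(columns),
        "ColumnHeaders: on",
        # Separators are ASCII codes: line=newline, column=comma, list=comma, host/service=pipe
        "Separators: 10 44 44 124",
    ]
    # A blank line terminates the query.
    return "\n".join(lines) + "\n\n"


def parse_response(text):
    """Turn the response into a list of dicts keyed by column name (naive comma split)."""
    rows = [line.split(",") for line in text.splitlines() if line]
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


def query_livestatus(host, port, query, timeout=10.0):
    """Send one query over TCP and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # tell the server the request is complete
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8")
```

Calling `parse_response(query_livestatus(host, port, build_query("hosts", ["name", "address", "alias", "hard_state"])))` with your own Nagios host and xinetd port would be the rough equivalent of the netcat test. If this works but the app's scripts do not, the network path to Livestatus is probably not the problem.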

However, none of the "live*.py" scripts in SplunkForNagios/bin ever succeed when I run them, and yes, I have configured them with the correct IP and port for mklivestatus.

I am also making sure to run the scripts with $SPLUNK_HOME/bin/python rather than the system Python binary.

After some digging I found that the call to splunk.Intersplunk.getOrganizedResults() seems to cause the issue: it never returns.

It appears to be an issue with the while True loop in the readResults() method of the Intersplunk module.

I'm not sure exactly what is wrong. I noticed that the Python version that ships with Splunk does not have readline compiled in, yet the first line inside the while loop calls input_buf.readline(), and I don't see how that can work without readline. I also tried a Python version with readline (copying the Splunk modules over to it for testing), but that didn't help either.

Any ideas?

babs101
Path Finder

Hi Luke, I am currently having the same issue. I followed the same idea of replacing src_host="syd1rtr01" with src_host="a known host from your nagios". I even went to the extent of removing src_host="syd1rtr01" completely and testing, and that failed too. I have made entries in inputs.conf to reflect index = nagios. I emailed you about this issue a while back and also posted the dashboard issue on this forum, where the auto-population does not seem to work.

If we replace src_host="syd1rtr01" with a known server from our Nagios, are we not setting that server as the default value for src_host? Secondly, is src_host not a variable whose value is subject to change? I am asking because we already have a Splunk head on-site that keeps the default setting src_host="syd1rtr01" and it does display the auto-populated values, but on the alert dashboard, for example, you cannot "select a Hostname" because there are no hosts in the pulldown menu.

I am not sure, but this seems to be a bug in SplunkForNagios. I have also noticed that there is an installation manual for SplunkForNagios but apparently no usage manual. Where can we find one?

lukeh
Contributor

Please ensure that your Splunk server can talk to MK Livestatus by executing a simple netcat command, for example:

root@homer:/opt/splunk/etc/apps/SplunkForNagios/bin# nc 10.10.10.10 6557 < nagios-hosts
name,address,alias,hard_state
bart,10.10.10.101,web server,0
lisa,10.10.10.102,database server,0

Note: replace 10.10.10.10 with the IP address of your Nagios server running MK Livestatus.

If you don't get a result using netcat, look at MK Livestatus next: ensure that the IP address of your Splunk server is listed next to "only_from" in /etc/xinetd.d/livestatus on your Nagios server.

You can also update a Python script to log errors; please refer to the following Splunk Answers posts for assistance:

http://splunk-base.splunk.com/answers/30535/any-advice-for-troubleshooting-scripted-lookups

http://splunk-base.splunk.com/answers/10283/python-scripted-lookup-doesnt-produce-any-results
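As a rough illustration of that kind of error logging, here is a minimal sketch using only the standard library. The logger name, the log file location, and the `run_logged` wrapper are hypothetical and are not part of Splunk for Nagios; the idea is simply to wrap the script's entry point so that a failure leaves a traceback in a file you can read.

```python
import logging


def make_logger(log_path, name="livestatus_lookup"):
    """Return a logger that appends to log_path (path and name are placeholders)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def run_logged(main, logger):
    """Run main(); on any exception write the traceback to the log and return 1."""
    try:
        main()
        return 0
    except Exception:
        # logger.exception records the message plus the full traceback
        logger.exception("lookup script failed")
        return 1
```

A script would then end with something like `sys.exit(run_logged(main, make_logger("/tmp/livestatus_lookup.log")))`, where `main` is the script's existing entry point and the path is any file the Splunk user can write to.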

All the best,

Luke 🙂

rjyetter
Path Finder

Hi Luke, along the same lines: we have made the appropriate changes to the Python scripts and replaced src_host with a valid host in Nagios, but all results show 0. Where else should we look for problems? Is there a log somewhere that we can look at to troubleshoot?

vishalprofessio
New Member

Hey Luke, no need to investigate.

I have changed the logic for "check-host-alive" in Nagios and enabled active checks for all the hosts. Now every host shows as UP in the Splunk Livestatus Dashboard 🙂 🙂

Thank you very much for your help 🙂

Best Regards

Vishal

lukeh
Contributor

That is good news Vishal, and you're welcome.

Luke 🙂

vishalprofessio
New Member

Thanks Luke. One more request for help ;).

In the Livestatus Dashboard some hosts are showing as Down, but in the Nagios dashboard all of them are UP. (I am monitoring some hosts where ping is disabled, and I have configured passive checks for those hosts in Nagios.)

Do you have any idea what the cause is? Do I have to change the logic on the Splunk side?

Thanks as always

-Vishal

lukeh
Contributor

Hi Vishal 🙂

Glad you got the dashboard working 🙂

It is OK to use any valid hostname from your Nagios configuration, because the underlying lookup scripts require the search result to contain a field called src_host before they perform the relevant MK Livestatus lookups.

I.e. the dashboard is populated by 8 different Python lookup scripts, and each of them is executed after a successfully completed Splunk search that contains src_host in its results.

So as long as you use a src_host that exists in the nagios logs which are indexed by splunk, the lookup scripts will perform their specific job.
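To make the role of src_host concrete, here is a minimal sketch of a lookup-style script under some loud assumptions: it is not the app's actual code, the function names are hypothetical, and the `status_by_host` dictionary stands in for whatever the real scripts fetch from MK Livestatus. It only illustrates the pattern described above: rows come in as CSV, nothing happens unless a src_host column is present, and each row gets an extra field filled in.

```python
import csv
import io


def enrich(rows, status_by_host):
    """Add a 'hard_state' value to each row; unknown hosts get an empty value."""
    out = []
    for row in rows:
        row = dict(row)
        row["hard_state"] = status_by_host.get(row.get("src_host", ""), "")
        out.append(row)
    return out


def run_lookup(csv_in, status_by_host):
    """Read CSV text, enrich it by src_host, and return CSV text."""
    reader = csv.DictReader(io.StringIO(csv_in))
    fieldnames = list(reader.fieldnames or [])
    if "src_host" not in fieldnames:
        return ""  # without src_host there is nothing to look up
    if "hard_state" not in fieldnames:
        fieldnames.append("hard_state")
    rows = enrich(reader, status_by_host)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

In this sketch a search result without a src_host column produces no output at all, which is one plausible reason a dashboard panel could stay empty when the configured host never appears in the indexed Nagios logs.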

Luke 🙂

vishalprofessio
New Member

Thanks Luke. I have a very small Nagios setup with 25 servers. My understanding is that the Livestatus Dashboard will show "Total UP/Down (Total current Up, Down, & Unreachable hosts)". As you suggested, in the NagiosLivestatus.xml file I have changed every src_host="syd1rtr01" entry to src_host="hostname of one of the servers I monitor from Nagios". It looks like it has started working 🙂 🙂. But I don't understand why src_host is required. Can we put in any hostname, and will it then show the UP/Down status of all hosts?

Can you please correct me if my understanding is wrong.

Thanks

Vishal

lukeh
Contributor

Change the device name to any hostname in your Nagios configuration. I would recommend choosing a host/device that is always up, e.g. a router, switch, or server.

All the best,

Luke 🙂

vishalprofessio
New Member

NagiosLivestatus.xml and change the "src_host" name to a relevant device name in nagios.

What do you mean by a relevant device name in Nagios? I have tried to set this up, but Livestatus is not working for me 😞

lukeh
Contributor

The Python scripts that are included in Splunk for Nagios only work from within the app, i.e. they won't work from the command line.
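One likely reason, assuming the scripts read their search results from standard input (which is how Splunk normally hands data to scripted lookups and custom search commands): when a script is started from a terminal with nothing piped in, a read on stdin simply waits. The loop below is a hypothetical stand-in written for illustration, not the Intersplunk source. Note that `readline()` here is the built-in method of file objects and does not depend on the optional `readline` module.

```python
import io


def read_header(input_buf):
    """Read 'key:value' lines until a blank line or end of input."""
    settings = {}
    while True:
        # Built-in file method; on an interactive terminal this call
        # waits until a line is typed or the stream is closed.
        line = input_buf.readline()
        if line == "":  # end of input: the writer closed the stream
            break
        line = line.rstrip("\r\n")
        if line == "":  # a blank line ends the header section
            break
        key, sep, value = line.partition(":")
        if sep:
            settings[key] = value
    return settings
```

With an in-memory buffer such as `io.StringIO("search:foo\n\n")` the function returns immediately. With `sys.stdin` attached to a terminal it would sit inside the loop until input arrives, which would look exactly like a call that never returns.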

Is the "Livestatus Dashboard" working? If not, you must edit NagiosLivestatus.xml and change the "src_host" name to a relevant device name in nagios.

Hope this helps,

Luke 🙂
