Are the symptoms below a bug? When readlog.py finds a large number in /tmp/seekposition and the audit.log file has been rotated and is now small, readlog.py does not correctly rewrite the seekposition file (print statements added to readlog.py):
Yeah, this is a bug, and I reported it as such to Splunk support (case # 26091, SPL-20135) on March 11, 2009. I also included a patch (essentially a one-line change) to fix the Python script. The script has been updated since then, but it appears that the fundamentally flawed logic is still being shipped. I've long since replaced this script on my systems, so I haven't been paying much attention to whether or not they've fixed it in the released version. (I figure I did my part by reporting the issues and even providing the one-line fix.) So for the sake of others feeling this pain, here's the fix I've come up with...
To correct this, I recommend simply changing this section of code:
To this:
In my mind, truncating the file (trimming off that 7th digit in your example) should minimize the risk of any kind of race condition. (In other words, if you truncate the file right after you read it and the script dies or aborts midway through reading the new input, you end up reloading the whole file rather than just a smaller chunk; this way, the odds are pretty slim that anything like that will happen.) There is still a really small window between
Of course, I also recommend you copy this script to your own custom app, since any in-place modifications will be overwritten during the next upgrade of the "unix" app (aka your next Splunk upgrade).
Of course, you still have the highly probable situation that whenever your audit log rotates you will miss the end of it, because "monitor"ing files with an input script like this is not nearly as robust as Splunk's default file monitoring process. This script makes no attempt to read the remaining portion of your previous log file (
Since I'm already complaining, I should also point out that I don't like the default state file location; some OSes (like Ubuntu) actually wipe out
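For anyone who wants to see the stale-digit problem in isolation, here is a minimal sketch. It is not the shipped readlog.py and not the patch sent to support; the function names (`read_seek_position`, `write_seek_position_buggy`, `write_seek_position`) are hypothetical, and it assumes the state file holds nothing but the byte offset as plain text.

```python
import os


def read_seek_position(state_path, log_path):
    """Return the saved offset, or 0 if it is missing, invalid, or past the end of the log."""
    try:
        with open(state_path) as f:
            position = int(f.read().strip() or 0)
    except (IOError, OSError, ValueError):
        return 0
    # A saved offset larger than the log suggests the log was rotated: start over.
    if position > os.path.getsize(log_path):
        return 0
    return position


def write_seek_position_buggy(state_path, position):
    """Overwrite in place without truncating (assumes the state file already exists)."""
    with open(state_path, "r+") as f:
        f.seek(0)
        f.write(str(position))  # any old digits beyond the new value survive


def write_seek_position(state_path, position):
    """Overwrite the state file and drop whatever was stored before."""
    mode = "r+" if os.path.exists(state_path) else "w"
    with open(state_path, mode) as f:
        f.seek(0)
        f.truncate()  # remove the old contents before writing the new offset
        f.write(str(position))
```

With "1234567" already in the state file, `write_seek_position_buggy(path, 123456)` leaves "1234567" behind, because the shorter number only overwrites the first six characters. `write_seek_position(path, 123456)` leaves "123456".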
Based on some conversations with people in the know at Splunk .conf2010, it sounds like the Windows and Unix apps are going to receive a much-needed overhaul in an upcoming release, which is good, because I've run into lots of little no-way-this-was-ever-tested kinds of issues. This stuff drives me nuts. I've been assured that the future quality of these apps will be much better.
Many thanks for the helpful answer and comments.
(24 Aug '10, 15:55)
bobhampton
Thank you for the comments. The "quality" of the UNIX app is a function of how many edge and corner cases we are aware of. So, please keep reporting any issues you find! Be sure to specify your UNIX flavor and version.
(08 Jan '11, 00:19)
V_at_Splunk
