I've created a custom command in Python that needs to see an entire set of events as a single batch, because it compares subsequent events. Unfortunately, Splunk is sending events to the custom command in chunks of <= 50,000 events. The commands.conf has streaming = false. Setting run_in_preview = false only changes the way the results are displayed, as expected.
In case it's relevant, the command is running on a search head which receives events from several distributed search nodes.
Here's the basic code -- run() is invoked by a minimal plugin "manager":
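(The original code didn't survive; for illustration, here is a minimal sketch of the comparison logic with the Splunk-specific I/O stripped out. The function and field names are my own assumptions, not the original code.)

```python
def run(events):
    """Compare each event with its predecessor and emit a result for
    each adjacent pair whose values differ.

    `events` is the batch Splunk hands to the command. Because Splunk
    delivers chunks of <= 50,000 events, run() is effectively called
    once per chunk, so pairs that straddle a chunk boundary are never
    compared, and a naive command that rewrites its output each call
    keeps only the last chunk's results.
    """
    results = []
    previous = None
    for event in events:
        if previous is not None and event["value"] != previous["value"]:
            results.append({"from": previous["value"], "to": event["value"]})
        previous = event
    return results
```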
When invoked by a single splunk search, these results are generated:
Once the search is complete, only the 3 results from the last batch of events are shown.
For completeness, here's commands.conf:
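(The conf file itself was lost; a stanza along these lines matches the settings described above. The stanza name and script filename are placeholders.)

```
[mycommand]
filename = mycommand.py
streaming = false
run_in_preview = false
```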
So, is there any way aside from the settings in commands.conf to really convince Splunk not to stream events into a custom command? Maybe an intermediate command I could insert into the pipeline?
Well, either someone else can spot what's missing or confirm that it's a bug, but in the meantime an easy way to make sure no streamed events reach your command is to put a non-streaming command in front of it.
should do it.
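(The answerer's exact command was lost; as one example of the technique, inserting sort — a non-streaming command — ahead of the custom command forces the full result set to be collected on the search head first. The command name mycommand is a placeholder, and sort 0 removes sort's default 10,000-result limit.)

```
... | sort 0 -_time | mycommand
```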
That behavior doesn't seem right to me, but streaming = false was never intended to make Splunk deliver all of the events to the search command regardless of event quantity. To my understanding, it is supposed to influence how the search machinery plans the search, and encourage it to hand only one chunk to the search command.
Essentially, you could view this flag as "I'm only designed for small datasets".
In order to make your tool work over large datasets, you'll want to be streaming, and you'll want to be able to handle the data chunk by chunk.
For some problems that opens up an entire new topic: how can you efficiently store your state, is it valid to emit nothing until the last call, and how do you know when it's the last call?
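For this particular problem the state is small: comparing adjacent events chunk by chunk only requires carrying the last event of each chunk into the next call. A minimal sketch of that idea (the names and driver loop are mine, not a real Splunk command implementation):

```python
def process_chunk(events, carry=None):
    """Compare adjacent events within a chunk, including the pair that
    straddles the boundary with the previous chunk.

    `carry` is the last event of the previous chunk (None on the first
    call). Returns (results, new_carry); the caller feeds new_carry
    into the next invocation.
    """
    results = []
    previous = carry
    for event in events:
        if previous is not None and event["value"] != previous["value"]:
            results.append({"from": previous["value"], "to": event["value"]})
        previous = event
    return results, previous

# Driving it over two chunks, the way Splunk would deliver them:
carry = None
all_results = []
for chunk in ([{"value": 1}, {"value": 2}], [{"value": 2}, {"value": 3}]):
    out, carry = process_chunk(chunk, carry)
    all_results.extend(out)
# all_results now holds both transitions, including the comparison
# across the chunk boundary that a per-chunk command would miss.
```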
answered 12 Oct '11, 13:04