Within Splunk we have multiple indexes, but one of them is exhibiting strange behaviour as the search time range crosses the creation time of the hot bucket.
If I run a plain search of:
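Something along these lines, with myindex standing in as a placeholder for the affected index:

    index=myindex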
With a time range of the last 30 days, the Timeline shows the pattern of event volume we expect to see over the last 30 days, for example 21,780 events on January 6th.
If I now change to a custom time range of 6 January 2011 00:00:00 - 7 January 2011 00:00:00, the same query returns only 99 events, with the first event of the day seen at 22:15:13.247. A search for the previous day returns 0 events (as opposed to the expected 20K plus).
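For reference, the custom-range search expressed with explicit time modifiers would look something like this (again with myindex as a placeholder):

    index=myindex earliest="01/06/2011:00:00:00" latest="01/07/2011:00:00:00"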
Day-long searches are similarly blank for every day back to 25th December 2010 (inclusive).
Just as the past-30-days search yields the correct results, a search from 24th December to 6th January shows the expected results. If I change the start date to 25th December (end date 7 January 2011 00:00:00), I get only the 99 results after 22:55:13 on the 6th January.
Under the index's db directory there appears to be only one warm bucket and one hot bucket. The creation time for both of these buckets is just after 23:00 on 6th January 2011 (note our data arrives in 5-minute batches, so the data from 22:55:00 - 23:00:00, which contains the last reliable data, would have been available for Splunk to index just after 23:00).
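For anyone checking this on their own system, the bucket states and the event time ranges they cover can be listed with dbinspect (index name is a placeholder):

    | dbinspect index=myindex
    | table bucketId state startEpoch endEpoch eventCount path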
On the 6th January I was doing a re-index of the data, and what I think has happened is that at the point the hot bucket rolled to warm, Splunk was monitoring the live file but still hadn't caught up on the history (it had got to, say, the 24th December). So the hot bucket may contain data time-stamped prior to the timestamp of the data which prompted the roll from hot to warm.
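If that's what happened, I'd expect dbinspect output along these lines (the numbers below are purely illustrative, not from our system), with the hot bucket's startEpoch reaching back inside the warm bucket's time range rather than starting where the warm bucket ends:

    state   startEpoch    endEpoch      eventCount
    warm    1291593600    1294354500    ...
    hot     1293235200    1294355100    ...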
Does this explanation make sense, and is this a risk in Splunk? That is, if I'm re-indexing, do I need to ensure that all the historical files are indexed before the live files are configured/enabled in inputs.conf?
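For what it's worth, the workaround I have in mind is a staged inputs.conf along these lines (paths and index name are placeholders): enable the historical files first, and only flip the live input to disabled = false once the backlog has been indexed:

    # Stage 1: backfill the historical files first (placeholder paths)
    [monitor:///data/archive/*.log]
    index = myindex
    disabled = false

    # Stage 2: keep the live file disabled until the backfill completes,
    # then set disabled = false so the hot bucket only receives current data
    [monitor:///data/live/current.log]
    index = myindex
    disabled = true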