I'm currently experiencing this:
1) Run a query that returns a large number of events (say, 1 million).
2) Save the job.
3) Load the events using | loadjob events=t
Step 3 only loads about 150k events, whereas if I click on the link in the job manager directly, it returns the full result set. Is this normal? If this is a bug, was it fixed in 4.1?
|
For any given search, Splunk will only retain a limited number of raw events, i.e., actual event data pulled out of the index by the
To illustrate the necessity for this, consider this simple example: if your index contains 2 billion events that match the search, storing all 2 billion events every time you run that search would consume the entire storage system in no time. Generally speaking, the 2-billion-row data set is not what you're after -- it's the summarized or transformed version that is of interest.
Note that the limitation described here does not mean that Splunk cannot handle lots of events. The search language will process all events asked of it, but will abide by these practical safety controls and not cache all of the raw data. For perspective, search for a word like
Assuming that this answer explains the question, the events=t parameter for loadjob loses its meaning (or is misrepresented/misunderstood by me :P), since the description says "Loads the events that were generated by the search job..." Though there's a strong case for conserving disk space, I'd think that when a search is saved there's an implicit expectation of being able to reference (in the future) all the raw events, in addition to the summarized/transformed data, with the appropriate loadjob command. :)
(11 Apr '10, 11:46)
rayfoo
Could someone provide information about the actual settings name and stanza in the
(12 Apr '10, 14:03)
Lowell ♦
|
I can confirm this behavior in Splunk 4.1. I ran the search * | head 200000, saved the job, and when I try to load it using loadjob <sid> events=t, I only get 10,000 events. (I looked in limits.conf but didn't see any settings in there.)
Did some digging in the dispatch folder, and now I'm really confused. If I look in the events folder, there are a number of *.csv.gz files which, when added up, only contain 10,000 events. But if I pull the job from the jobs manager, I can see all 200,000 events. So where are the other 190,000 events stored? I must be missing something.
Anyone? I can't offer a bounty as yet, since I don't have enough rep to give away :P
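For anyone who wants to repeat the "add up the *.csv.gz files" check from the answer above, here is a minimal Python sketch. It is not a Splunk tool. The function name count_events is made up for illustration, and the layout it assumes (one header row per file, one event per CSV record) is a guess, not a documented format.

```python
import csv
import gzip
from pathlib import Path


def count_events(events_dir):
    """Sum the data rows of every *.csv.gz file directly under events_dir.

    Assumption (not confirmed by any Splunk documentation): each file starts
    with one header row and stores one event per CSV record.
    """
    total = 0
    for path in sorted(Path(events_dir).glob("*.csv.gz")):
        # newline="" lets the csv module handle line breaks inside quoted fields,
        # so a multi-line event still counts as a single record.
        with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the assumed header row
            total += sum(1 for _ in reader)
    return total
```

Call it with the path to the job's events folder inside the dispatch directory, e.g. count_events("<path to job>/events"), and compare the result with the number of events the job manager reports.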