I've seen a number of posts about this with varied responses.
Here's what I'm trying to do:
We have some web access logs that record load balancer hits every second. Those logs get sent (via Universal Forwarder) to the Splunk Indexer. I'd like to filter out those load balancer hits before they're indexed. Otherwise it's a waste of license.
I have been having a hard time getting this to work. I've read other posts on here with similar questions, but none of the solutions are working for me.
Is it true that this cannot be done with the Universal Forwarder? What exactly are the consequences of running the heavy forwarder?
Thanks!
Yes, it is true that you cannot filter events on a Universal Forwarder.
I wouldn't say it is a waste of license - you can filter these events at the indexer just as well. I would say however that it's a waste of bandwidth, which may or may not be an issue for you.
Running a heavy forwarder causes Splunk to take up some more resources because of the additional capabilities. I'll leave it to others to describe this in detail (I don't have any specifics myself). In the meantime, this page in the docs covers some of the differences: http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Forwardercapabilities
Thank you for your feedback! I guess by "waste of license", I assumed that everything that was being forwarded was being indexed. I'll check out the doc you referred me to.
It's true that you can't filter the events out at the Universal Forwarder, but you can still do this.
Filtering of this kind will occur at the point at which the data is indexed, so you have two main options. The filtering configuration is pretty much the same either way (routing to nullQueue); it just depends on where you configure it.
a) Switch to a Heavy Forwarder
b) Perform the filtering at the central indexer
Personally, I'd go with option (b), especially if you have multiple web servers and aren't using the deployment manager.
A third option, sometimes good for remote sites, is to have a single heavy forwarder acting as a bridgehead, and have Universal Forwarder -> (Heavy Forwarder w/ filtering) -> Central Indexer.
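For what it's worth, the forwarding side of that bridgehead setup might look roughly like the sketch below. The group name and host name are made up for illustration, and 9997 is only the conventional receiving port, so adjust all of them to your environment.

```ini
# outputs.conf on each Universal Forwarder
# (group name "bridgehead" and the host name are hypothetical)
[tcpout]
defaultGroup = bridgehead

[tcpout:bridgehead]
server = heavyfwd.example.com:9997

# inputs.conf on the Heavy Forwarder, so it listens for forwarded data
[splunktcp://9997]
disabled = 0
```

The Heavy Forwarder would then need its own outputs.conf pointing at the central indexer, plus the filtering configuration in props.conf and transforms.conf.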
In that case, your stanza header in props.conf would look like [source::.../access_servername.log]. The rest should match up with what is in the documentation link given above.
The TRANSFORMS-xxx line in props.conf tells Splunk which stanza in transforms.conf to look at, and the REGEX entry in transforms.conf tells Splunk what text to match on.
Be careful to specify a regex pattern that will match only on the load balancer hits, so that you don't discard legitimate traffic entries.
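As a rough sketch of how the two files fit together: the transform name drop_lb_hits and the regex below are placeholders I made up, so substitute a pattern that matches only your load balancer's requests. The same stanzas go on whichever instance does the parsing (the central indexer or a Heavy Forwarder).

```ini
# props.conf
# Scope the filter to one log file by source rather than by sourcetype.
[source::.../access_servername.log]
TRANSFORMS-null = drop_lb_hits

# transforms.conf
# Events matching REGEX are routed to nullQueue, i.e. discarded before indexing.
# The pattern here is hypothetical; replace it with text unique to the load balancer hits.
[drop_lb_hits]
REGEX = lb-healthcheck
DEST_KEY = queue
FORMAT = nullQueue
```

The name after "TRANSFORMS-" is just a label; what matters is that the value matches the stanza name in transforms.conf. A restart of the Splunk instance is normally needed before the change takes effect, and it only applies to data arriving after that point.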
They do, actually.
The simplest way in that case would be to apply the filtering transform to a source instead of a sourcetype. Do the servers you wish to filter log to different filenames?
Thank you for the prompt reply!
I agree that "b" sounds better, especially for my environment.
I'm just unclear on how to implement it. How does the incoming forwarded data connect with the correct transform? I don't want to attach it to a sourcetype because other events from other hosts share the same sourcetype, and I don't necessarily want to filter them. (Does my question make sense?)
Thanks!