Hi All,

There is a set of webservers we are trying to index which have many virtual hosts on them. This is simple enough to add in Apache by changing the LogFormat from

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
to
LogFormat "%V %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" vcombined

However, this breaks the automatic field extraction that Splunk normally does when parsing Apache log files.

So I dug into /opt/splunk/etc/system/default/transforms.conf and found these lines

[access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled in purpose because that is the "official" spelling for "HTTP referer"
REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[nspaces:bytes]?+)?[[all:other]]

and in /opt/splunk/etc/system/default/props.conf found this

[access_combined]
pulldown_type = true
maxDist = 28
MAX_TIMESTAMP_LOOKAHEAD = 128
REPORT-access = access-extractions
SHOULD_LINEMERGE = False
TIME_PREFIX = \[

I can see I just need to add a [[nspaces:vhost]]\s++ to the start of the transforms.conf REGEX, but obviously I don't want to mess with the defaults.

I tried to replicate what I saw in props.conf and transforms.conf in my own app, but it did not work.

my inputs.conf

[monitor:///etc/httpd/logs/access_log*]
sourcetype = vhost_access_combined
disabled = false
followTail = 0
host = development.server.com
index = webserver

my props.conf

[vhost_access_combined]
pulldown_type = true
maxDist = 28
MAX_TIMESTAMP_LOOKAHEAD = 128
REPORT-access = vhost-access-extractions
SHOULD_LINEMERGE = False
TIME_PREFIX = \[

my transforms.conf

[vhost-access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: vhost, clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled in purpose because that is the "official" spelling for "HTTP referer"
REGEX = ^[[nspaces:vhost]]\s++[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[nspaces:bytes]?+)?[[all:other]]

Any ideas how to get this working?

I have more complex questions to follow about setting the host in Splunk to the value of vhost in the log entry, but I will take this in baby steps first.

asked 05 Jul '11, 19:54

phoenixdigital

edited 05 Jul '11, 19:57

Here is an example log line

developer.management.theclient.rdev.com 192.168.31.108 - stingray [06/Jul/2011:12:33:21 +1000] "GET /pop.php?m=testimonial/edit&id=1 HTTP/1.1" 200 166 "http://developer.management.theclient.rdev.com/?m=testimonial/details&id=1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0"

(05 Jul '11, 20:00) phoenixdigital

2 Answers:

OK, I eventually got it to work using the following:

inputs.conf

[monitor:///etc/httpd/logs/access_log*]
sourcetype = advanced_access_combined
index = webserver
disabled = false
followTail = 0
host = devserver.remora.com.au

[monitor:///etc/httpd/logs/error_log*]
index = webserver
disabled = false
followTail = 0
host = devserver.remora.com.au

props.conf

[advanced_access_combined]
pulldown_type = true
maxDist = 28
MAX_TIMESTAMP_LOOKAHEAD = 128
REPORT-access = advanced-access-extractions
SHOULD_LINEMERGE = False
TIME_PREFIX = \[

transforms.conf

[all_lazy]
REGEX = .*?

[all]
REGEX = .*

[nspaces]
# matches one or more NON space characters
REGEX = \S+

[qstring]
# matches a quoted "string" - extracts an unnamed variable - name MUST be provided as in [[qstring:name]]
# Extracts: empty-name-group (needs name)
REGEX = "(?<>[^"]*+)"

[sbstring]
# matches a string enclosed in [] - extracts an unnamed variable - name MUST be provided as in [[sbstring:name]]
# Extracts: empty-name-group (needs name)
REGEX = \[(?<>[^\]]*+)\]

[bc_domain]
REGEX = (?<domain>\w++://[^/\s"]++)

[bc_uri]
# backwards compatible uri regex
# uri = path optionally followed by query [/this/path/file.js?query=part&other=var]
# path = root part followed by file [/root/part/file.part]
# Extracts: uri, uri_path, root, file, uri_query, uri_domain (optional if in proxy mode)
REGEX = (?<uri>[[bc_domain:uri_]]?+(?<uri_path>[[uri_root]]?[[uri_seg]](?<file>[^\s?/]+)?)(?:\?(?<uri_query>[^\s]))?)

[reqstr]
REGEX = [^\s"]++

[access-request]
# very relaxed regex for extracting fields from the request
REGEX = "\s+[[reqstr:method]]?(?:\s++[bc_uri])?\s*+"

[advanced-access-extractions]
REGEX = ^[[nspaces:vhost]]\s++[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]]\s++[nspaces:req_process_time]?+)?[[all:other]]

It seemed I needed to copy a lot of extra stanzas from /opt/splunk/etc/system/default/transforms.conf, which makes sense.
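If only the vhost field is needed, an inline search-time extraction in props.conf is a lighter alternative that does not depend on any of the modular regex stanzas. This is a minimal sketch and not what was used above: it assumes the virtual host is the first whitespace-delimited token of every event, the class name "vhost" is arbitrary, and it has not been tested against this setup.

```ini
# props.conf, in the same app as the sourcetype definition
[advanced_access_combined]
# Assumption: %V is the first field in the LogFormat, so the first
# run of non-space characters in each event is the virtual host.
EXTRACT-vhost = ^(?<vhost>\S+)\s
```

Against the example log line in the question, this would be expected to yield vhost=developer.management.theclient.rdev.com.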

Another issue I encountered was that I have a primary index server and the Apache log files are being forwarded to it by a 'Universal Forwarder'.

The whole thing did not work when props.conf and transforms.conf were on the 'Universal Forwarder'. I needed to add them to the indexing server for the log files to be parsed correctly.

This is potentially going to be an issue, as I would like the virtual host in the log file to be used as the Splunk host. The host is currently defined on the 'Universal Forwarder' in inputs.conf; however, I don't extract the virtual host until the event hits transforms.conf on the indexing server. I think by that time it will be too late to set the Splunk host. Anyway, I will create a new question for that, as it is out of the scope of this one.
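For reference, Splunk's documented mechanism for per-event host overrides is an index-time transform that writes to MetaData:Host. A minimal sketch, assuming it is deployed on the indexing server (where parsing happens) and that the virtual host is the first token of each event; the stanza name vhost_as_host is hypothetical, and this has not been tested against the setup above:

```ini
# props.conf on the indexing server
[advanced_access_combined]
# TRANSFORMS- (unlike REPORT-) is applied at index time
TRANSFORMS-vhost_as_host = vhost_as_host

# transforms.conf on the indexing server
[vhost_as_host]
# capture the first whitespace-delimited token (the %V value)
REGEX = ^(\S+)\s
# overwrite the event's host metadata with the captured value
DEST_KEY = MetaData:Host
FORMAT = host::$1
```

Because this runs during parsing, it would only affect events indexed after the change; events that were already indexed keep the host set in inputs.conf.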

Edit: The formatting rules here mangle pasted conf files, so the configs above are a bit munted. If someone needs the original configs, message me (if that's possible with Splunkbase).

answered 06 Jul '11, 19:00

phoenixdigital

edited 06 Jul '11, 19:10

A Universal Forwarder does not perform any parsing.

http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Typesofforwarders
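In practice this means splitting the configuration between the two machines. A sketch of one possible layout, reusing the stanzas from the answer above; the app name vhost_apache is hypothetical and the paths assume a standard app directory under $SPLUNK_HOME:

```ini
# --- On the Universal Forwarder: collection only ---
# $SPLUNK_HOME/etc/apps/vhost_apache/local/inputs.conf
[monitor:///etc/httpd/logs/access_log*]
sourcetype = advanced_access_combined
index = webserver
disabled = false

# --- On the indexing server: parsing and field extraction ---
# $SPLUNK_HOME/etc/apps/vhost_apache/local/props.conf
[advanced_access_combined]
REPORT-access = advanced-access-extractions
SHOULD_LINEMERGE = False
TIME_PREFIX = \[

# $SPLUNK_HOME/etc/apps/vhost_apache/local/transforms.conf
# holds the [advanced-access-extractions] stanza from the answer above
```

The forwarder only needs to tag events with the sourcetype; everything keyed on that sourcetype lives on the server that parses and searches the data.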

answered 29 Feb, 08:58

oscarspaz

Copyright © 2005-2012 Splunk, Inc. All rights reserved.