Teach Splunk 4.2 to identify positions in a CSV file

MichalZ
Engager

Hi,

I need Splunk to index software distribution logs. The logs are built by a shell script from data gathered from a few sources, and one log file is created per day.


Log name example: GCKD-20110304.csv

Log name convention: GCKD-yyyymmdd.csv


Log content example: 1722383;winxp;MS10-034;xx-x-xxxxxxx;SUCCESSFUL;2011.03.04

Log content convention: DistroID;OS;patch;EndPoint;State;Date;Time


DistroID - 7-digit distribution ID

OS - for which type of Windows is the patch specified (two values: winxp or win7)

patch - name of M$ patch (MSXX-XXX)

EndPoint - receiving machine - 15 characters

State - distribution state: SUCCESSFUL; FAILED; EXPIRED; etc

Date - yyyy.mm.dd format

Time - hh:mm:ss


Can anyone tell me how to configure Splunk to use the distribution date and time as the event timestamp, and to use the EndPoint name as the source host? Also, how can I configure the input for more advanced reporting (for example, teaching Splunk the name of each CSV field so I can build good-looking reports)?

ftk
Motivator

First, make sure you assign a sourcetype to these logs. For example, add sourcetype = distribution_log to the stanza in inputs.conf that monitors these files.
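A minimal sketch of such an inputs.conf stanza, assuming the files are picked up with a monitor input. The directory is a placeholder and not from the original post; only the GCKD-yyyymmdd.csv naming comes from the question.

```ini
# inputs.conf
# /var/log/distribution is a hypothetical path; point it at the directory
# where the shell script writes the daily GCKD-yyyymmdd.csv files.
[monitor:///var/log/distribution/GCKD-*.csv]
sourcetype = distribution_log
disabled = false
```

The sourcetype name only has to match the stanza name used in props.conf.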

Next, in props.conf and transforms.conf, set up your field extractions as well as your host configuration.

Let's do the CSV fields first. In transforms.conf:

[extract-distribution-fields]
DELIMS = ";"
FIELDS = "DistroID","OS","Patch","EndPoint","State","Date","Time"

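And apply the extraction in props.conf. DELIMS and FIELDS are search-time settings, so reference this transform with REPORT- rather than TRANSFORMS-:

[distribution_log]
REPORT-extract-header = extract-distribution-fields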

To replace the host value with EndPoint, in transforms.conf:

[extract-distribution-host]
DEST_KEY = MetaData:Host
REGEX = ^\d+;[^;]*;[^;]*;([^;]+);
FORMAT = host::$1

and again apply the extraction in props.conf:

[distribution_log]
TRANSFORMS-extract-host = extract-distribution-host

So all together your config files could look like this. In transforms.conf:

[extract-distribution-fields]
    DELIMS = ";"
    FIELDS = "DistroID","OS","Patch","EndPoint","State","Date","Time"

[extract-distribution-host]
    DEST_KEY = MetaData:Host
    REGEX = ^\d+;[^;]*;[^;]*;([^;]+);
    FORMAT = host::$1

and props.conf:

[distribution_log]
REPORT-extract-header = extract-distribution-fields
TRANSFORMS-extract-host = extract-distribution-host

b4ggio
Explorer

Timestamp extraction is configured in props.conf under /opt/splunk/etc/apps/<>/local. You may have to create this file, then add a stanza for your sourcetype and set the timestamp options in it.

It's all explained here: http://www.splunk.com/base/Documentation/latest/admin/propsconf (search or scroll to "Timestamp extraction configuration").
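As a rough sketch of what that stanza might look like for this log format: it assumes the sourcetype is named distribution_log and that every line ends with the Date and Time fields as in the stated convention (DistroID;OS;patch;EndPoint;State;Date;Time). The sample line in the question has no time field; if the real logs look like that, TIME_FORMAT would be just %Y.%m.%d and the lookahead 10.

```ini
# props.conf
[distribution_log]
# Each line of the CSV is one event.
SHOULD_LINEMERGE = false
# Skip the first five semicolon-delimited fields (DistroID..State).
TIME_PREFIX = ^(?:[^;]*;){5}
# Date and Time fields, e.g. 2011.03.04;hh:mm:ss
TIME_FORMAT = %Y.%m.%d;%H:%M:%S
# yyyy.mm.dd (10) + ; (1) + hh:mm:ss (8) = 19 characters
MAX_TIMESTAMP_LOOKAHEAD = 19
```

Timestamp settings are applied when the data is indexed, so they only affect events indexed after the change.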

To teach Splunk to recognise the files and pull the information into a report, look at field extraction: you can write a regex that names each field, using the semicolon as the delimiter and picking each field by its position. Fortunately, Splunk will also do this for you if you use the extract field wizard.
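A hedged sketch of the regex approach, written as a search-time EXTRACT- entry in props.conf. The class name distribution-fields is made up for illustration, and the field names are taken from the convention in the question. It is an alternative to a DELIMS/FIELDS transform, so only one of the two is needed.

```ini
# props.conf
[distribution_log]
# One named capture group per semicolon-delimited position.
# The Time group is optional because the sample line in the question has no time field.
EXTRACT-distribution-fields = ^(?<DistroID>\d+);(?<OS>[^;]*);(?<Patch>[^;]*);(?<EndPoint>[^;]*);(?<State>[^;]*);(?<Date>[^;]*)(?:;(?<Time>[^;]*))?$
```

Because this is a search-time extraction, it also applies to data that has already been indexed.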

Hope this helps.
