Is there a way to export the data that isn't correct then re-import it using the correct sourcetype? If not, is there another way to change the sourcetype after the data has been indexed?

asked 16 Apr '10, 22:38

Jaci ♦

edited 17 Apr '10, 00:42

jrodman ♦


2 Answers:

The easiest method is to wipe the data and reindex.

Wiping the data can be global (splunk clean eventdata -index myindex) or more focused (splunk search "some data | delete"). The full wrinkles of these methods are discussed elsewhere.
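As a rough sketch of the global wipe-and-reindex route, the function below only prints the commands so they can be reviewed before anything is run. It assumes splunkd has to be stopped around the clean, and that the original log file is still on disk to be fed back in with a one-shot input; the function name, index name, sourcetype and file path are placeholders, not values from this thread.

```shell
# Print (do not run) the commands for wiping one index and reindexing a file.
# Arguments: index name, correct sourcetype, path to the original log file.
reindex_plan() {
    index=$1
    sourcetype=$2
    logfile=$3
    echo "splunk stop"
    echo "splunk clean eventdata -index $index"
    echo "splunk start"
    # Assumption: the source file still exists and is re-read as a one-shot input.
    echo "splunk add oneshot $logfile -index $index -sourcetype $sourcetype"
}

# Placeholder values, for illustration only.
reindex_plan myindex right_sourcetype /var/log/app.log
```

Printing the plan first is a deliberate choice here, since the clean step is destructive for the whole index.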


Another option is sourcetype renaming. If you want to alias an entire sourcetype to another one, you can do so in props.conf, e.g.:

[wrong_sourcetype]
rename = right_sourcetype

This clearly doesn't work if [wrong_sourcetype] is also a valid sourcetype in its own right, because the rename applies to every event with that sourcetype.


It's also possible to dump a bucket to a csv format, manipulate that, and then generate a new bucket from the modified or filtered csv data. This is sort of, 'for wizards'.

The command to emit a bucket to csv is

splunk cmd exporttool bucketname filename.csv -csv

To generate a new bucket from the csv, you can use

splunk cmd importtool new_bucket_dir filename.csv

You will have to give the new bucket directory a correct Splunk bucket name, either manually (for example by naming it the same as the original) or with some kind of script. I used the following shell fragment, where $bucket was the old bucket:

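# $bucket is the old bucket directory. $NEW_BUCKET and $bucket_dir are not defined in
# this fragment; they are assumed to be the directory written by importtool and the
# index directory that holds the buckets.
bucket_id=$(echo "$bucket" | sed 's/.*_//')
# Each tsidx file is named high-low-id.tsidx; reduce each name to "high low".
(cd "$NEW_BUCKET"; ls *.tsidx | sed 's/-[0-9]\+\.tsidx$//' | sed 's/-/ /') | {
    global_low=0
    global_high=0
    while read high low; do
        if [ $global_high -eq 0 ] || [ $high -gt $global_high ]; then
            global_high=$high
        fi
        if [ $global_low -eq 0 ] || [ $low -lt $global_low ]; then
            global_low=$low
        fi
    done
    REAL_BUCKET_NAME=db_${global_high}_${global_low}_${bucket_id}
    mv "$NEW_BUCKET" "$bucket_dir/$REAL_BUCKET_NAME"
}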
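
To make the naming logic in the fragment above easier to follow, here is a small stand-alone sketch that runs the same reduction on made-up tsidx file names instead of a real bucket. The function name, the file names and the bucket id 7 are hypothetical; only the high-low-id.tsidx pattern and the db_high_low_id result are taken from the fragment.

```shell
# Derive a bucket directory name from tsidx file names read on stdin.
# $1 is the numeric id taken from the end of the old bucket's name.
bucket_name() {
    id=$1
    # Strip the trailing "-<id>.tsidx", then split "high-low" into "high low".
    sed -e 's/-[0-9][0-9]*\.tsidx$//' -e 's/-/ /' | {
        newest=0
        oldest=0
        while read -r high low; do
            if [ "$newest" -eq 0 ] || [ "$high" -gt "$newest" ]; then
                newest=$high
            fi
            if [ "$oldest" -eq 0 ] || [ "$low" -lt "$oldest" ]; then
                oldest=$low
            fi
        done
        echo "db_${newest}_${oldest}_${id}"
    }
}

# Hypothetical file names, for illustration only.
printf '%s\n' \
    1271400000-1271300000-111.tsidx \
    1271450000-1271350000-222.tsidx | bucket_name 7
# prints db_1271450000_1271300000_7
```

The name takes the largest "high" and the smallest "low" across all tsidx files, so the bucket's time range covers everything inside it.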

Once you have a newly constructed, duplicated bucket, you can remove the old one from your index and insert the new one.
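
For the sourcetype case in the question, the export, edit, import cycle might look roughly like the sketch below. The exporttool and importtool invocations are the ones given above; everything else is an assumption. In particular, it assumes the exported csv marks the field as sourcetype::name, which should be checked against a real export before relying on it. The function names and /tmp paths are placeholders.

```shell
# Rewrite one sourcetype to another in csv text read from stdin.
# ASSUMPTION: the export writes the field as "sourcetype::<name>".
# $1 is the wrong sourcetype, $2 the right one.
rewrite_sourcetype() {
    # The trailing group stops "foo" from also matching "foo_bar" or "foo-2".
    sed -E "s/sourcetype::$1([^A-Za-z0-9_:-]|\$)/sourcetype::$2\\1/g"
}

# Export a bucket, fix the sourcetype, and build a new bucket from the result.
# Arguments: old bucket dir, new bucket dir, wrong sourcetype, right sourcetype.
reimport_bucket() {
    old_bucket=$1
    new_bucket=$2
    splunk cmd exporttool "$old_bucket" /tmp/export.csv -csv &&
    rewrite_sourcetype "$3" "$4" < /tmp/export.csv > /tmp/fixed.csv &&
    splunk cmd importtool "$new_bucket" /tmp/fixed.csv
}

# Example call (not run here; paths are placeholders):
# reimport_bucket /path/to/old_bucket /tmp/new_bucket wrong_sourcetype right_sourcetype
```

The new bucket directory still needs to be renamed and swapped in for the old one afterwards, as described above.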

The main problem with exporttool/importtool is that they're not all that optimized, so they use a lot of RAM and keep the CPU busy for a long time. We'll be making them faster, but for now you should make sure the box where you run them has enough memory and CPU headroom.

If you want to go down that path, the full script (treat as example) is stuck in the wiki over here: http://www.splunk.com/wiki/Community:Modifying_indexed_data_via_export_and_import

answered 16 Apr '10, 23:39

jrodman ♦

edited 17 Apr '10, 01:02

No and no. Once data has been indexed, that's the state it's going to stay in. An export/import capability has been requested on a number of occasions, but it's not built yet. If you want to change the 'sourcetype' value, all you can really do is re-index the data.

If that's not possible, then the next best solution is to just use tags - http://www.splunk.com/base/Documentation/latest/Knowledge/Defineandusetags

answered 16 Apr '10, 23:50

Mick ♦


Copyright © 2005-2012 Splunk, Inc. All rights reserved.