Getting Data In

Charset Encoding

kenchisho
Path Finder

Hi guys,

I have installed Splunk 4.3 on Mac OS X 10.7.

I am trying to index data that is not UTF-8 encoded. I have tried nearly every character set Splunk supports without any luck: the non-ASCII characters get replaced with other symbols.

Example

In my log files I have "DAVOR ĆORIĆ", and it gets indexed as "DAVOR žORIž" or with some other symbol, depending on which charset I use with this sourcetype. I never get the correct data indexed.

Has anyone had a similar problem, and possibly found a simple solution?

jbsplunk
Splunk Employee

Here is a list of supported character sets, and instructions on how to apply them to data:

http://docs.splunk.com/Documentation/Splunk/latest/data/Configurecharactersetencoding
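As a rough illustration of what that page describes, the character set is set per source or sourcetype with the `CHARSET` attribute in props.conf. The stanza name `my_sourcetype` below is a placeholder, and CP1250 is only an assumption based on what usually works for this kind of data. The setting is applied when the data is read, so it should only affect events indexed after the change and a restart.

```
# props.conf, for example in $SPLUNK_HOME/etc/system/local/
# "my_sourcetype" is a hypothetical sourcetype name.
[my_sourcetype]
CHARSET = CP1250
```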

jbsplunk
Splunk Employee

If you open the file with a tool like TextWrangler, what does it detect as the charset? I've found that to be pretty reliable in troubleshooting these kinds of issues.
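If no editor is at hand, the same check can be approximated with a small script. This is a minimal sketch, not anything Splunk provides: the function name `matching_encodings` and the candidate list are assumptions chosen for Central European text. It decodes the raw bytes with each candidate and reports which ones reproduce a string you know should be in the file.

```python
# Candidate encodings are an assumption; extend the list as needed.
CANDIDATES = ["utf_8", "cp1250", "iso8859_2", "cp852", "mac_latin2"]


def matching_encodings(raw: bytes, expected: str, candidates=CANDIDATES):
    """Return the candidate encodings under which `raw` decodes
    to text that contains `expected`."""
    matches = []
    for name in candidates:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; skip it.
            continue
        if expected in text:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # Stand-in for bytes read from the log file with open(path, "rb").
    sample = "DAVOR ĆORIĆ".encode("cp1250")
    print(matching_encodings(sample, "ĆORIĆ"))
```

More than one name can come back, because some code pages place a given letter at the same byte value. An empty list suggests the file uses an encoding outside the candidate list, or that the text was already damaged before it was written.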

kenchisho
Path Finder

Hi jbsplunk,

thanks for the quick reply.

I have tried setting the charset manually, but Splunk still garbles the data when indexing. I have tried nearly all the charsets available in Splunk. Usually with this type of data I use CP1250 and all goes well, but with this data set no charset configuration works.

I have also tried this with a Linux install of Splunk, thinking it might be an OS X related issue, and I get the same results.

I am guessing this might be a bug, but I am not quite sure yet.
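When no charset setting helps, looking at the raw bytes of one affected line can show whether the file is in a single-byte code page at all. The helper below is a hypothetical sketch (the name `non_ascii_bytes` is not from Splunk or any library): it lists the offset and hex value of every byte outside the ASCII range so the values can be looked up in a code page table.

```python
def non_ascii_bytes(raw: bytes):
    """Return (offset, hex value) pairs for every byte >= 0x80."""
    return [(i, f"0x{b:02X}") for i, b in enumerate(raw) if b >= 0x80]


if __name__ == "__main__":
    # Stand-in for one line read from the log with open(path, "rb").
    line = "DAVOR ĆORIĆ".encode("cp1250")
    for offset, value in non_ascii_bytes(line):
        print(offset, value)
```

One high byte per accented letter points to a single-byte code page, while two bytes per letter (for example 0xC4 0x86 for Ć) points to UTF-8. If the accented letters show up as plain ASCII bytes such as 0x3F ("?"), the characters were lost before the file reached Splunk.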
