Getting Data In

variance between _time and date_* fields

kbrady
Explorer

I've got a situation where different date fields return inconsistent results for the same time data. I suspect this comes down to index-time vs. search-time processing and the timezone difference between the data as presented and as indexed. The data contains a timestamp in UTC, which Splunk converts to MDT. The date_* values are wrong when compared to the _time field, though they are correct relative to the raw timestamp.

If this behavior were consistent across all system-provided time fields, I could live with it easily enough, but the fact that the value presented in _time differs from the date_* fields is problematic. Is this a bug or expected behavior?

Here's info on the search, data returned, props config, and splunk version

Search modifiers:
| eval wd=lower(strftime(_time,"%A")) | table _time, date*, wd

Raw time data:
2013-08-16T05:10:05

Data presented:
_time==8/15/13 11:10:05.000 PM
wd==thursday
date_wday==friday
date_mday==16
date_hour==5

props.conf config:
[my_sourcetype]
KV_MODE = json
MAX_TIMESTAMP_LOOKAHEAD = 500
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TRUNCATE = 100000
TIME_FORMAT = %Y-%m-%dT%H:%M:%S
TIME_PREFIX = ,\"start\":\"
TZ = UTC
pulldown_type = 1

Splunk version:
5.0.4, build 172409

1 Solution

lguinn2
Legend

This is expected behavior and is described in the Splunk documentation. The date_* fields are a direct parse of the text timestamp in the input stream. No timezone is applied to them, in any version of Splunk. These fields do not even exist for inputs that do not use text timestamps; for example, none of them will appear for an input that uses Unix epoch time.

_time is what I call the "normalized" time field. It is considered the true time that the event occurred and always takes timezone into consideration. All events have a _time field in Splunk, even if the input stream has no timestamp at all.

If you want to do accurate datetime arithmetic, IMO you should never use the date_* fields. Instead, extract the necessary information from _time, as you did in your example search. For example:

yoursearchhere
| eval weekDay = strftime(_time,"%a")
| eval HourOfDay = strftime(_time,"%H")
| table _time, weekDay, HourOfDay

You can also use the relative_time function to do some cool date arithmetic. For example, this calculates the beginning of the month for each event, and formats the times nicely:

yoursearchhere
| eval BeginningOfMonth = strftime(relative_time(_time,"@mon"),"%x %X")
| eval EventTime=strftime(_time,"%x %X")
| table EventTime, BeginningOfMonth
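
Since relative_time returns epoch seconds, you can also do the arithmetic directly on the numeric values and format only at the end. A small sketch along the same lines (the field names here are made up for illustration):

yoursearchhere
| eval SecondsIntoMonth = _time - relative_time(_time, "@mon")
| eval DaysIntoMonth = floor(SecondsIntoMonth / 86400)
| table _time, SecondsIntoMonth, DaysIntoMonth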


halkelley
Path Finder

I think your macro needs to have:

        eval  date_wday = strftime(_time, "%A")

for date_wday (capital "A") to work correctly.


lguinn2
Legend

Yes, "%a" gives the DOW abbreviation, not the full name of the DOW.


lguinn2
Legend

You could create a macro to help your user base. I'd call the macro something like normalize_datetime and have it calculate/override the existing date_* fields. As an added benefit, the macro could also be used on data that has no date_* fields at all.

eval date_hour = strftime(_time, "%H") | eval date_mday = strftime(_time, "%d") | eval date_minute = strftime(_time, "%M") | eval date_month = strftime(_time, "%m") | eval date_second = strftime(_time, "%S") | eval date_wday = strftime(_time, "%a") | eval date_year = strftime(_time, "%Y") | eval date_zone = "UTC"
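
A minimal sketch of how that could be wired up in macros.conf and then invoked, assuming the normalize_datetime name suggested above (illustrative, not a tested configuration):

# macros.conf
[normalize_datetime]
definition = eval date_hour = strftime(_time, "%H") | eval date_mday = strftime(_time, "%d") | eval date_minute = strftime(_time, "%M") | eval date_month = strftime(_time, "%m") | eval date_second = strftime(_time, "%S") | eval date_wday = strftime(_time, "%a") | eval date_year = strftime(_time, "%Y") | eval date_zone = "UTC"

Then in a search:

yoursearchhere | `normalize_datetime` | table _time, date_*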

lguinn2
Legend

Update: you may find that these fields do exist even for inputs that do not use text timestamps. So far, I have not figured out how to remove the date_* fields from automatic field extraction, and I have looked. I completely understand the comment "some of my user base will have a harder time."


kbrady
Explorer

Fair enough; at least it's documented. Not a big deal for me to stop using the date_* fields, but some of my user base will have a harder time.

Thank you for a solid explanation.
