vix.input.1.et.regex - Log directory only contains year, month, and day. How are searches affected if there is no hour?

yahoohunk
Explorer

When I run the search "index=api" with a date range of, for example, 07/07/2015 - 07/07/2015, I only get results from just after midnight on 07/07/2015.

But if I run the same search with the date range 07/07/2015 - 07/08/2015, I get more results for 07/07/2015, and they include more than just midnight.

Does the lack of hours and minutes in the directory path affect the search? In Splunk, the range 07/07/2015 - 07/07/2015 returns all results for that day.

My HDFS logs are partitioned by year/month/day.

Virtual index (indexes.conf):
[api]
vix.provider = testprovider
vix.input.1.path = /projects/test/test_logs/api/...
vix.input.1.accept = .
vix.input.1.et.regex = /projects/test/test_logs/.+/(\d\d\d\d)/(\d\d)/(\d\d)/.+
vix.input.1.et.format = yyyyMMdd
vix.input.1.et.offset = 0
vix.input.1.lt.regex = /projects/test/test_logs/.+/(\d\d\d\d)/(\d\d)/(\d\d)/.+
vix.input.1.lt.format = yyyyMMdd
vix.input.1.lt.offset = 86400
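
To make the settings above concrete, here is a minimal Python sketch of how a path could map to a time window. It assumes that the capture groups are concatenated, parsed with the yyyyMMdd format, and shifted by the offsets; the function name path_time_window and the use of UTC are illustrative assumptions, not part of Splunk or Hunk. The sample path is the one quoted later in this thread.

```python
import re
from datetime import datetime, timedelta, timezone

# Regex copied from vix.input.1.et.regex / vix.input.1.lt.regex above.
PATH_REGEX = re.compile(r"/projects/test/test_logs/.+/(\d\d\d\d)/(\d\d)/(\d\d)/.+")


def path_time_window(path, et_offset=0, lt_offset=86400):
    """Return (earliest, latest) implied by a path, or None if it does not match.

    Assumption: the capture groups are joined into one string and parsed
    as yyyyMMdd, interpreted here as UTC purely for illustration.
    """
    match = PATH_REGEX.match(path)
    if match is None:
        return None
    stamp = "".join(match.groups())  # "2015", "07", "07" joined into "20150707"
    day = datetime.strptime(stamp, "%Y%m%d").replace(tzinfo=timezone.utc)
    return day + timedelta(seconds=et_offset), day + timedelta(seconds=lt_offset)


if __name__ == "__main__":
    sample = "/projects/test/test_logs/api/2015/07/07/api_server.log.2015-07-07.gz"
    earliest, latest = path_time_window(sample)
    print(earliest.isoformat(), latest.isoformat())
```

Under these assumptions the directory only yields a one-day window (midnight to midnight) for the whole file. The time of each individual event still has to come from the event text itself.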

suarezry
Builder

Typically the events themselves would have timestamps. Did you configure timestamp recognition?

Configuring timestamp recognition in Splunk

For example, suppose an event contains:

2015-07-10T13:40:51Z syslog.tcp {"message":"<166>2015-07-10T13:40:51.076Z somehost.somedomain Vpxa: [FFEDEB90 verbose 'VpxaHalCnxHostagent' opID=WFU-c8713dc4] [WaitForUpdatesDone] Received callback","client_host":"10.6.50.104"}

props.conf:

[source::/my/source/...]
sourcetype = hadoop
priority = 100
ANNOTATE_PUNCT = false
SHOULD_LINEMERGE = false
MAX_TIMESTAMP_LOOKAHEAD = 30
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%SZ
TZ = UTC

yahoohunk
Explorer

I have not configured the timestamp recognition. Will try that and the timezone and see if it works.

Ledion_Bitincka
Splunk Employee

Also, make sure to set the same timezone in indexes.conf for vix.input.1.et/lt.timezone. In many cases the data is in GMT while the search is run in a user-specified timezone, e.g. PST.
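
As a rough illustration of why the timezone matters, the following Python sketch compares the one-day window for the same calendar date in GMT and in a Pacific-style offset. The helper day_window and the fixed UTC-7 offset (standing in for US Pacific time in July) are assumptions made for this example only.

```python
from datetime import datetime, timedelta, timezone

GMT = timezone.utc
# Fixed offset used only for illustration; real zones also have DST rules.
PACIFIC_JULY = timezone(timedelta(hours=-7))


def day_window(year, month, day, tz):
    """Return (start, end) of a calendar day as seen in the given timezone."""
    start = datetime(year, month, day, tzinfo=tz)
    return start, start + timedelta(days=1)


gmt_start, gmt_end = day_window(2015, 7, 7, GMT)
pac_start, pac_end = day_window(2015, 7, 7, PACIFIC_JULY)

# How far the Pacific day window is shifted relative to the GMT one.
shift = pac_start - gmt_start

if __name__ == "__main__":
    print("GMT window:    ", gmt_start.isoformat(), "to", gmt_end.isoformat())
    print("Pacific window:", pac_start.isoformat(), "to", pac_end.isoformat())
    print("Shift:", shift)
```

If the directory date is interpreted in one timezone and the search range in another, the two windows only partly overlap, so a single-day search could miss part of that day's data.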

jeffland
SplunkTrust

"Does not having hours and minutes in the directory affect the search" - which directory is that? If Splunk determines the timestamp for your events from the directory structure, then of course those are needed (and Splunk will give events a midnight timestamp if only the day is available).
Could you clarify how you run the three searches you mention above: 1) from 07/07/2015 - 07/07/2015, 2) from 07/07/2015 - 07/08/2015, and 3) "In Splunk"? Please also post some example timestamps from those results.

yahoohunk
Explorer

The directory and file look like this in HDFS:
/projects/test/test_logs/api/2015/07/07/api_server.log.2015-07-07.gz

A line in the log looks like this:
[Thu Jul 09 02:03:02 2015] [error] [client 127.0.0.1] log={"messages": "test"}
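
Since Splunk's TIME_FORMAT uses strptime-style directives, a candidate format for this line can be checked quickly in Python. This is only a sketch: CANDIDATE_FORMAT and parse_event_time are names made up for the example, and the %a and %b directives assume an English locale.

```python
from datetime import datetime

# Sample line copied from the post above.
LINE = '[Thu Jul 09 02:03:02 2015] [error] [client 127.0.0.1] log={"messages": "test"}'

# Candidate strptime pattern for the bracketed prefix (an assumption to verify).
CANDIDATE_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_event_time(line, fmt=CANDIDATE_FORMAT):
    """Parse the text between the leading '[' and the first ']' of a log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, fmt)


if __name__ == "__main__":
    print(parse_event_time(LINE))
```

If the format holds, the matching props.conf settings would plausibly be TIME_PREFIX = ^\[ and TIME_FORMAT = %a %b %d %H:%M:%S %Y, though that has not been tested against this data.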

I ran the search index="api" in Smart Mode.

Used the "Date Range" option with Between "07/07/2015" and "07/07/2015".
Time Column Results

7/7/15
12:00:15.000 AM

Event Column Results
[Tue Jul 07 00:00:15 2015] [error] log={"messages": "test"}
29,000 events
In the results listings, I don't see anything beyond 00:00

Used the "Date Range" option with Between "07/07/2015" and "07/08/2015".
Time Column Results
7/7/15
12:00:15.000 AM

Event Column Results
[Tue Jul 07 00:00:15 2015] [error] log={"messages": "test"}
76,000,000 events
In the results listings, I don't see anything beyond 00:00

Question: It reports 76 million matching events, but the results list shows only hour 0.

jeffland
SplunkTrust

That looks like a problem with your timestamp recognition.
