Hi All,
I am trying to monitor HDFS files in Splunk using the Hadoop Connect app. The HDFS directory structure changes every day, and Splunk Hadoop Connect is not able to pull the new files into Splunk. Can this be achieved, or am I missing something here?
When using the Hadoop Connect Import function, make sure that the Resource name path is set high enough in the directory tree to cover the daily changes. For example, if you currently have hadoop.example.com:8020/year/month/day/hour/, you can change it to hadoop.example.com:8020/year/ so that new month, day, and hour directories still fall under the imported path.
https://docs.splunk.com/Documentation/HadoopConnect/latest/DeployHadoopConnect/ImportfromHDFS
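To illustrate why the higher-level path helps, here is a minimal Python sketch. It is not Hadoop Connect code and does not call any Splunk or HDFS API; it only models the idea, assuming the import picks up everything beneath the Resource name path, as suggested above. The file paths, the `is_under` helper, and the dates are all hypothetical.

```python
def is_under(root: str, path: str) -> bool:
    """Return True if `path` lies somewhere beneath the directory `root`."""
    # Normalise to exactly one trailing slash so "/2017" does not match "/20170".
    root = root.rstrip("/") + "/"
    return path.startswith(root)


# Hypothetical files following a /year/month/day/hour/ layout.
files = [
    "/2017/06/01/00/events.log",
    "/2017/06/02/00/events.log",  # appears the next day
    "/2017/07/01/13/events.log",  # appears the next month
]

hour_level = "/2017/06/01/00/"  # path pinned to a single hour
year_level = "/2017/"           # path set at the year level

# Only the file that already existed under the pinned hour is covered.
matched_hour = [f for f in files if is_under(hour_level, f)]

# Every file is covered, including the ones created on later days.
matched_year = [f for f in files if is_under(year_level, f)]

print(len(matched_hour), "file(s) under", hour_level)
print(len(matched_year), "file(s) under", year_level)
```

With the path pinned at the hour level, directories created on later days never fall under it, which matches the behaviour described in the question.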
Also, you may want to look at Splunk Analytics for Hadoop (formerly Hunk) as a way to monitor HDFS files:
https://www.splunk.com/blog/2013/11/08/hunk-intro-part-3.html
https://www.splunk.com/blog/2014/05/14/hunkonhunk.html