We have a third-party application that writes HTML-formatted logs, and we cannot change that. The data we want is laid out in a table. I cannot figure out how to use field extractions to pull this data out, though this is a weak area for me (for now). What would you suggest for extracting this data from the logs?

asked 24 May '12, 12:44

rgcurry

Can you please post a sanitized example? That would certainly help us help you.

(24 May '12, 12:51) araitz ♦

I have requested that info from the primary contact for this application group. Will post as soon as I get it.

(24 May '12, 13:22) rgcurry

Here is an example from the HTML-formatted log. We want to use the header cells as the field names and the cells in each data row as the corresponding values.

<table width="100%" cellPadding="4" cellSpacing="0" align="right" style="table-layout:fixed;word-break:break-all;border-width:1pt">
<tr bgcolor="gray">
<td width="10%" style="color: Yellow"><b>Date<br>and Time</b></td>
<td width="20%" style="color: Yellow"><b>Thread</b></td>
<td width="8%" style="color: Yellow"><b>Login</b></td>
<td width="7%" style="color: Yellow"><b>IP</b></td>
<td width="5%" style="color: Yellow"><b>Type</b></td>
<td width="20%" style="color: Yellow"><b>Method</b></td>
<td width="30%" style="color: Yellow"><b>Message</b></td>
</tr>
<tr bgcolor="tomato"><td>Jan 18<br>18:48:36.018</td><td>WebContainer : 2</td><td>N/A</td><td>N/A</td><td>ERR</td><td>CAbsServlet.doPost(333)</td><td>Invalid request: Remote host: 10.175.226.11, Meta Data: [Function Name: GetBugValue, Login Session ID: 1339572, Project Session ID: 1029005, Call ID: 28]. Error: The session authentication has failed..</td></tr>
<tr bgcolor="tomato"><td>Jan 18<br>18:48:36.030</td><td>WebContainer : 2</td><td>N/A</td><td>N/A</td><td>ERR</td><td>CAbsServlet.doPost(353)</td><td>&nbsp<p>com.mercury.optane.core.CTdException<p>Messages:<br>The session authentication has failed.;<br><p>Stack Trace:<br>com.mercury.optane.core.CTdException: The session authentication has failed.<br>at com.mercury.td.tdserver.authentication.CLoginSessionDirectory.getItem(CLoginSessionDirectory.java:115)<br>at com.mercury.td.tdserver.authentication.CLoginSessionDirectory.getItem(CLoginSessionDirectory.java:94)<br>at com.mercury.td.web.CAbsServlet.assertRequestValidity(CAbsServlet.java:209)<br>at com.mercury.td.web.CAbsServlet.doPost(CAbsServlet.java:330)<br>at javax.servlet.http.HttpServlet.service(HttpServlet.java:763)<br>at javax.servlet.http.HttpServlet.service(HttpServlet.java:856)<br>at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1213)<br>at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1154)<br>at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:145)<br>at com.hp.qc.core.utils.gzipfilter.GZIPFilter.doFilter(GZIPFilter.java:30)<br>at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:190)<br>at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:130)<br>at com.ibm.ws.webcontainer.filter.WebAppFilterChain._doFilter(WebAppFilterChain.java:87)<br>at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:848)<br>at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:691)<br>at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:654)<br>at com.ibm.ws.wswebcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:526)<br>at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:90)<br>at 
com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:764)<br>at com.ibm.ws.wswebcontainer.WebContainer.handleRequest(WebContainer.java:1478)<br>at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:133)<br>at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:457)<br>at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewRequest(HttpInboundLink.java:515)<br>at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.processRequest(HttpInboundLink.java:300)<br>at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.ready(HttpInboundLink.java:271)<br>at com.ibm.ws.tcp.channel.impl.NewConnectionInitialReadCallback.sendToDiscriminators(NewConnectionInitialReadCallback.java:214)<br>at com.ibm.ws.tcp.channel.impl.NewConnectionInitialReadCallback.complete(NewConnectionInitialReadCallback.java:113)<br>at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)<br>at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)<br>at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)<br>at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:136)<br>at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:196)<br>at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:751)<br>at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:881)<br>at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1551)<br></td></tr>

NOTE: I tried to paste this code in so that it would display but the whole table does not display. To see the rendered code, you will need to copy and paste into a file to feed to your browser. If anyone knows how to make the whole table display here, I'd like to know the way to make it so.

(25 May '12, 06:46) rgcurry

One Answer:

Sorry it took me so long to get back to this!

props.conf:

[your_sourcetype]
KV_MODE=none
SHOULD_LINEMERGE=True
BREAK_ONLY_BEFORE=^\<table
DATETIME_CONFIG=CURRENT
REPORT-12312=headers,row,values

transforms.conf:

[headers]
REGEX=\<td.*?Yellow\"\>\<b\>(.*?)\<\/b\>\<\/td\>
FORMAT=field::$1
MV_ADD=true
REPEAT_MATCH=True

[row]
REGEX=(?m)\<tr\sbgcolor\=\"tomato\"\>(.*)\<\/tr\>
FORMAT=row::$1

[values]
SOURCE_KEY=row
REGEX=\<td\>(.*?)\<\/td\>
FORMAT=value::$1
MV_ADD=true
REPEAT_MATCH=True
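
To see what the three transforms capture, the patterns can be tried outside Splunk. The following is a minimal Python sketch, not Splunk itself: it assumes a shortened copy of the sample (three of the seven columns), uses Python's re module rather than Splunk's regex engine, and the names SAMPLE and extract are made up for illustration.

```python
import re

# Shortened copy of the sample log: part of the header row plus one data row.
SAMPLE = '''<tr bgcolor="gray">
<td width="10%" style="color: Yellow"><b>Date<br>and Time</b></td>
<td width="20%" style="color: Yellow"><b>Thread</b></td>
<td width="8%" style="color: Yellow"><b>Login</b></td>
</tr>
<tr bgcolor="tomato"><td>Jan 18<br>18:48:36.018</td><td>WebContainer : 2</td><td>N/A</td></tr>'''

# Same patterns as the [headers], [row] and [values] stanzas, without the
# backslash escapes that Python does not need.
HEADERS_RE = re.compile(r'<td.*?Yellow"><b>(.*?)</b></td>')
ROW_RE = re.compile(r'<tr\sbgcolor="tomato">(.*)</tr>')
VALUES_RE = re.compile(r'<td>(.*?)</td>')


def extract(event):
    """Return the (field, value) lists that the three patterns capture."""
    field = HEADERS_RE.findall(event)   # every header cell, in order
    row = ROW_RE.search(event)          # first matching data row only
    value = VALUES_RE.findall(row.group(1)) if row else []
    return field, value


field, value = extract(SAMPLE)
print(field)
print(value)
```

The sketch keeps only the first "tomato" row because it uses a single search, which mirrors a [row] stanza that has no MV_ADD. It also shows that the patterns only match header cells styled with Yellow and data rows whose bgcolor is tomato, so rows logged with another color would need their own pattern.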

With those extractions in place, this search yields a multivalued field called 'key_val':

sourcetype=your_sourcetype | eval key_val=mvzip(field,value)

Its first value will be:

"Date<br>and Time,Jan 18<br>18:48:36.018"
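
The pairing that mvzip performs can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper below is a stand-in written for this example (not Splunk code), the lists hold the first three columns of the sample, and it assumes the multivalued fields keep their cells in document order.

```python
def mvzip(left, right, delim=","):
    """Join two lists pairwise with a delimiter, stopping at the shorter one."""
    return [f"{a}{delim}{b}" for a, b in zip(left, right)]


# First three header cells and the matching cells of the first data row.
field = ["Date<br>and Time", "Thread", "Login"]
value = ["Jan 18<br>18:48:36.018", "WebContainer : 2", "N/A"]

key_val = mvzip(field, value)
print(key_val[0])  # Date<br>and Time,Jan 18<br>18:48:36.018

# The header-to-value mapping the question asks for.
record = dict(zip(field, value))
print(record["Thread"])  # WebContainer : 2
```

Because the pairing is purely positional, it only lines up when every header has exactly one cell in the row.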

answered 06 Jul '12, 12:18

araitz ♦

I really appreciate your time on this. I am busy right now with a migration of my Splunk environments to a new platform and will get back to this as either time allows (I may have a delay between the TEST and PROD migrations) or after these are complete. Reading over this, I see this just might do the trick. Again, thank you for sharing your expertise.

(11 Jul '12, 07:08) rgcurry

No problem, we are here to help! BTW, you should be able to use the replace search command to strip the <br> tags or swap them for spaces.

(11 Jul '12, 08:52) araitz ♦
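
The effect of that cleanup can be sketched in Python. This is a stand-in for the idea rather than the Splunk replace command itself, and the function name strip_br is hypothetical.

```python
import re

# Matches <br>, <br/> and <br /> in any letter case.
BR_RE = re.compile(r"<br\s*/?>", re.IGNORECASE)


def strip_br(text, replacement=" "):
    """Swap every <br> tag in text for the replacement string."""
    return BR_RE.sub(replacement, text)


print(strip_br("Date<br>and Time"))        # Date and Time
print(strip_br("Jan 18<br>18:48:36.018"))  # Jan 18 18:48:36.018
```

Replacing the tag with a space rather than deleting it keeps "Date" and "and Time" from running together.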

Asked: 24 May '12, 12:44

Seen: 9,662 times

Last updated: 29 Jan, 10:30

Copyright © 2005-2012 Splunk Inc. All rights reserved.