Hi, we are using Splunk to look at indexed logs, with the Google Maps add-on enabled to view each query's origin. The problem: when we run the search sourcetype="*" | geoip, it should supposedly return information for all events (nearly 5 billion of them!), but it only shows about 19,000 events. This is a disaster, as Splunk's geoip is really important to us. Any clues what might be causing the problem and how to fix it?

asked 23 Dec '11, 20:57

nina15
814316
accept rate: 0%

I'm surprised no one has any answers for this post... does this mean that the geoip plugin has that huge problem and no one has ever fixed it??

(28 Dec '11, 18:42) nina15

thanks for the responses...

sourcetype="*" | geoip ???

How can that work? You need to pass a field such as src_ip or client_ip to the command so geoip knows which IP addresses to look up. Whatever your IP address field is called, just use that in the search:

sourcetype="*" | geoip src_ip

If I want it to show all the events that exist, what should my search be? I need the map to count all events. One more issue: is this wrong?

sourcetype="*" | geoip geoip_country_name

I get some errors when I perform this search...

(02 Jan '12, 18:00) nina15

ok so now this is my search query:

   * | geoip SourceIP

But again, it only shows me 19,940 events (as I said, the total is nearly 5 billion!). Is it possible that geoip has some kind of limit on the results it displays?

P.S.: the results from * | geoip SourceIP are exactly the same as those from my original search:

sourcetype="*" | geoip
(03 Jan '12, 01:10) nina15

Are you not seeing the new fields on the left, near the bottom when you run a search like this?

SourceIP=* | geoip SourceIP | timechart span=1d count(clientip_city) as "City Count" by clientip_city | rename clientip_city as City

(03 Jan '12, 03:50) dmaislin_splunk ♦

I played around with some larger firewall logs and observed the same behavior: geoip always stops after approximately 20,000 events.

I will do some further testing later.

(03 Jan '12, 22:51) Spelunke

I have not been able to find the right parameter so far. geoip still stops working after around 20k events and 12 seconds.

(04 Jan '12, 13:28) Spelunke

Thanks for your answers. Yes, I figured that out, and I have also been looking into it for a few days, but I still could not find the right parameter. In limits.conf, the last parameter (max_count) seemed to be the one, but changing it has no effect.

(05 Jan '12, 22:52) nina15

One other thing I cannot understand: limits.conf is supposed to control all parts of Splunk (since it is located under system/default), so how come a search without geoip easily returns billions of results, and the limit only kicks in when geoip is used in the search?

(05 Jan '12, 23:03) nina15

Could this problem come from geoip's free license rather than from the limits.conf parameters?

(11 Jan '12, 19:16) nina15

11 Answers:

The observed behavior is a post-process limitation of Splunk. If you take a look at the default maps view, you will notice that the results are post-processed. If you search through Splunkbase, you'll find multiple discussions regarding the 10k post-process limitation.

The results are summarized behind the scenes for the user. The module will automatically apply the following postprocess-search to the base-search:

eval _geo_count=coalesce(_geo_count,1) | stats sum(_geo_count) as _geo_count by _geo

So the results are aggregated to a count per unique (distinct) location. The resulting number of records is usually lower by an order of magnitude.

E.g., when dealing with results based on a geo-IP database, there will not be a huge number of unique locations, since the number of records in the GeoLite City database is not that big. A lot of IP addresses share the same location.
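
To make the effect concrete, here is a minimal Python sketch (not Splunk code) of what that post-process step does to a capped result set. The 10,000-row cap is the post-process limit mentioned above; the function name, the sample rows and the "lat,lon" strings are made up for illustration.

```python
from collections import defaultdict

POSTPROCESS_LIMIT = 10_000  # cap on rows handed to the post-process step (per the thread)


def postprocess_by_location(rows, limit=POSTPROCESS_LIMIT):
    """Mimic: eval _geo_count=coalesce(_geo_count,1) | stats sum(_geo_count) as _geo_count by _geo"""
    totals = defaultdict(int)
    for row in rows[:limit]:  # rows beyond the cap are never seen
        geo = row.get("_geo")
        if geo is None:
            continue  # "stats ... by _geo" ignores rows without the field
        count = row.get("_geo_count")
        totals[geo] += 1 if count is None else count  # coalesce(_geo_count, 1)
    return dict(totals)


# 25,000 raw events that resolve to only two (made-up) locations
raw_events = [{"_geo": "52.52,13.40" if i % 2 else "48.85,2.35"} for i in range(25_000)]
truncated = postprocess_by_location(raw_events)

# The same data, already summarized to one row per location before post-processing
pre_aggregated = [
    {"_geo": "52.52,13.40", "_geo_count": 12_500},
    {"_geo": "48.85,2.35", "_geo_count": 12_500},
]
complete = postprocess_by_location(pre_aggregated)

print(sum(truncated.values()), sum(complete.values()))
```

With raw events, everything past the cap is lost before the aggregation runs, so the total can never exceed 10,000. When the rows are already summarized per location, this example stays far below the cap and the totals are complete.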

The GoogleMaps module will only fetch 100,000 results from the search endpoint. This is a hard-coded limitation at the moment, since the browser won't be able to handle more records at a time.

A better approach is to summarize the result in the base-search, by searching for something like:

sourcetype=something src_ip=* | stats count as _geo_count by src_ip | geoip src_ip | search _geo=* | stats sum(_geo_count) as _geo_count by _geo

Here's a short explanation of what this search does:

sourcetype=something src_ip=*

Reduce the result in the base search to those events that contain the relevant IP field

| stats count as _geo_count by src_ip

Aggregate by distinct IP address

| geoip src_ip

Do the geo-ip lookup

| search _geo=*

Filter out those results that do not contain geo-information

| stats sum(_geo_count) as _geo_count by _geo

Aggregate again to the summarized count of events by distinct location (i.e. distinct combination of latitude and longitude).
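
For readers who want to follow the data flow step by step, the same pipeline can be sketched in Python. This is only an illustration under assumptions: GEO_TABLE is a hypothetical stand-in for the geoip lookup, and the IPs and coordinates are made-up sample values.

```python
from collections import Counter

# Hypothetical stand-in for the geoip lookup: IP -> "lat,lon" (missing when unknown)
GEO_TABLE = {
    "192.0.2.1": "52.52,13.40",
    "192.0.2.2": "52.52,13.40",
    "198.51.100.7": "48.85,2.35",
}


def events_per_location(events, lookup=GEO_TABLE.get):
    # sourcetype=something src_ip=*   -> keep only events that carry the IP field
    ips = [e["src_ip"] for e in events if e.get("src_ip")]
    # | stats count as _geo_count by src_ip
    per_ip = Counter(ips)
    per_location = Counter()
    for ip, count in per_ip.items():
        geo = lookup(ip)            # | geoip src_ip  (one lookup per distinct IP)
        if geo is None:             # | search _geo=*
            continue
        per_location[geo] += count  # | stats sum(_geo_count) as _geo_count by _geo
    return dict(per_location)


events = (
    [{"src_ip": "192.0.2.1"}] * 3
    + [{"src_ip": "192.0.2.2"}] * 2
    + [{"src_ip": "198.51.100.7"}] * 4
    + [{"src_ip": "203.0.113.9"}]    # not in the table, so it is filtered out
    + [{"msg": "no ip field"}]       # dropped by the src_ip=* filter
)
result = events_per_location(events)
print(result)
```

The point of aggregating by IP first is that the lookup runs once per distinct address instead of once per event, and only the small per-location summary has to reach the map.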

If you're really dealing with an even bigger number of distinct locations (more than 100k), which I doubt, then you will need to perform some kind of server-side clustering. There will be support for accurate geo-clustering in a future version of the Google Maps app. In the meantime, you can use the kmeans command or craft a custom search command.
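
As a rough idea of what server-side clustering means here, the sketch below merges nearby locations into grid cells and keeps one weighted centroid per cell. Note that this is plain grid binning in Python, not the kmeans command and not anything the Google Maps app ships; cluster_by_grid and cell_deg are hypothetical names, and the points are made-up sample values.

```python
import math
from collections import defaultdict


def cluster_by_grid(points, cell_deg=1.0):
    """points: iterable of (lat, lon, count). Returns one weighted centroid per grid cell."""
    cells = defaultdict(lambda: [0.0, 0.0, 0])  # sum(lat*w), sum(lon*w), sum(w)
    for lat, lon, count in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        acc = cells[key]
        acc[0] += lat * count
        acc[1] += lon * count
        acc[2] += count
    return [(s_lat / w, s_lon / w, w) for s_lat, s_lon, w in cells.values()]


points = [
    (52.5, 13.4, 10),   # these two fall into the same 1-degree cell
    (52.5, 13.6, 30),
    (48.85, 2.35, 5),   # separate cell
]
clusters = sorted(cluster_by_grid(points))
print(clusters)
```

A larger cell_deg gives fewer, coarser markers. The total event count is preserved either way, since each cell only sums the counts of the locations it absorbs.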

answered 21 Feb '12, 11:28

ziegfried ♦
10.1k1618
accept rate: 52%

edited 27 Feb '12, 06:16

Thanks a lot, ziegfried, for the comprehensive, detailed answer. Everyone here kindly tried to help me with this problem, and finally there is an answer. Two thoughts, though:

1: Your suggested search query does not fetch anything for me. It appears to be searching, but the number of fetched results stays at 0 and the search progress sticks at 46% for a very long time (almost a day).

2: Your doubt is actually unfounded. I do have nearly 1.5 million "distinct locations".

(26 Feb '12, 17:59) nina15

I forgot the geoip command in the search.

You should take a closer look at the kmeans command to do server-side clustering of the results.

(27 Feb '12, 06:18) ziegfried ♦

OK, from what I understand, your comments are about the Google Maps limit of 100,000. 1) Why does Splunk only go up to 10k instead of 100k, and, if possible, how can that be changed? 2) Why does running the geoip command in Splunk's main search (the flashtime runner) have the same issue, although the geoip command doesn't have anything to do with Google Maps?

(01 Mar '12, 20:30) nina15

I'm trying sourcetype="*" | geoip SourceIP | kmeans k=100 SourceIP_country_name, but it's not giving me anything. Any suggestions on the search command?

(06 Mar '12, 01:16) nina15

I'm still waiting for an answer. Did I miss any answers here?

(15 Mar '12, 23:22) nina15

sourcetype="*" | geoip ???

How can that work? You need to pass a field such as src_ip or client_ip to the command so geoip knows which IP addresses to look up. Whatever your IP address field is called, just use that in the search:

sourcetype="*" | geoip src_ip

answered 30 Dec '11, 05:01

dmaislin_splunk ♦
1.1k26
accept rate: 20%

does this mean that the geoip plugin has that huge problem and no one has ever fixed it??

It's very unlikely that this is a major bug that has gone unnoticed until now. More likely it's a problem with your configuration.

Also remember that this is a community-based forum, that means a lot of people who read it are probably on vacation or busy with the holidays.

answered 30 Dec '11, 07:14

billmercer
422
accept rate: 0%

The argument to geoip must be the IP field you want to look up. For example:

 * | geoip src

where src is the field containing the IP.

answered 02 Jan '12, 22:59

Spelunke
495
accept rate: 0%

You must pass the field name that contains the IP addresses after the geoip command or it will not work.

answered 03 Jan '12, 04:13

dmaislin_splunk ♦
1.1k26
accept rate: 20%

Answering my own question from my previous post: the parameter "maxout" looked like it could solve the misery, as it is under the [subsearch] stanza, which makes sense, as geoip is a subsearch, not a whole search. But changing that parameter's value still did not help at all. No effect!

answered 06 Jan '12, 00:46

nina15
814316
accept rate: 0%

In this context geoip is not a subsearch, which is why the limits.conf parameter you mention has no influence on it.

(30 Jan '12, 23:33) hexx ♦

Noted, thanks. I'm still waiting to find out the source of the problem.

(31 Jan '12, 01:28) nina15

I think I found the source of the problem. I think it's neither Google Maps nor geoip, but actually any view or special search other than the normal search! To prove that I'm correct, you can try a simple search "*", which retrieves everything in the normal search, and then try it in the "Advanced chart view": again, only fewer than 15-20 thousand results will be shown (while the normal search goes up to a few billion). I think there is a bug in Splunk's views, or in any kind of advanced searching for that matter, so I've started a new thread here: Bug? Splunk advanced searching/views does not display correctly

answered 11 Jan '12, 20:49

nina15
814316
accept rate: 0%

Yes, I do get new fields like SourceIP_city, SourceIP_country_code, SourceIP_country_name, etc., which are generated by geoip.

The search you suggested is taking forever to complete (as I have nearly 5 billion events). This search is supposed to count per day, right?

I'm more and more convinced that my geoip configuration has some sort of limit of about 20,000 on displayed results. For example, when I run the query

SourceIP=* | geoip SourceIP | timechart span=1d count(clientip_city) as "City Count" by clientip_city | rename clientip_city as City

it searches all events per day and fetches millions of events, but since it is daily, each day counts as one result overall, hence the search does not stop. But when I run this query:

SourceIP=* | geoip SourceIP

by itself, it stops at 19,940.

answered 03 Jan '12, 18:56

nina15
814316
accept rate: 0%

edited 03 Jan '12, 18:58

OK, I found more interesting evidence for the limit I was talking about. I tried this query in the Google Maps view:

DestinationIP=* | geoip DestinationIP

It says 22,539 "matching events",

and above the map it says:

"10000 results with location information ( 1 distinct locations ) over all time"

and when I again tried

SourceIP=* | geoip SourceIP

I get 19,946 "matching events" (same as before)

and above the map says:

"9984 results with location information ( 1315 distinct locations ) over all time"

I said before that I think some limit is acting as a barrier here, and that it is somehow related to results. I think this confirms that it is related to the number of "results", not matching events, and that the limit seems to be 10,000.

any ideas about this kind of configuration...?

answered 03 Jan '12, 20:35

nina15
814316
accept rate: 0%

In case you want to take a look at the limits, they are defined in $SPLUNK_HOME/etc/system/default/limits.conf. Find the one you'd like to change, then create a new limits.conf containing it and place it under:

$SPLUNK_HOME/etc/system/local/limits.conf

answered 04 Jan '12, 05:08

dmaislin_splunk ♦
1.1k26
accept rate: 20%

Asked: 23 Dec '11, 20:57

Seen: 2,900 times

Last updated: 15 Mar '12, 23:22

Copyright © 2005-2012 Splunk Inc. All rights reserved.