Apr 21 2009

Saturday Morning Zen

On Friday we got a nice amount of snow, and overnight into Saturday we got even more. It was way too warm for it to stick for long, but Saturday morning was just gorgeous. Soon after I got up, I put the camera on the wall in front of the house and captured 35 mins of New Mexico Morning Beauty. I sped it up to just a bit over 4 mins. Not a lot happens in the video below; you’ll just see the sun gently running her fingers across the landscape … (make sure you select HD below, otherwise it’ll look like utter rubbish):

Soundtrack is “november” by “raja_ffm” via ccMixter.org.

PS: wish you could see the big version on a big screen …


Apr 7 2009

Google Earth Forensics

[ Update 04/16/2009: The code for the tool is now available for download. Look at the end of this post … ]
A lot of log files on my system contain ip-addresses: sshd logs attack attempts (and successful logins), snort logs common intrusion tactics, apache logs errors and accesses, etc.

Using my newfound love for Google Earth and some perl hacking, I created a little tool that allows me to monitor the log information above and put threats, errors and accesses on the map (literally!).

Here’s how it works: a perl script runs on a regular basis and scans a number of log files on my system for new information. If new information is found, it generates a KML file with placemarks that point to the location responsible for each log entry. Say snort complains about a potential SQL injection attempt from address a.b.c.d: the script will look up the location of a.b.c.d (again using Marc’s free ip2location database), add a placemark for it (with some details from the log files) and repeat the same sequence for the other new log entries. Everything is bundled up in a KML file, a tour is added and the stuff is shipped to Google Earth.
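To make the scan-and-placemark step a bit more concrete, here is a minimal sketch of how it could look. This is not the code from the actual tool: the sample log lines, the helper names (group_by_ip, xml_escape, placemark) and the 0,0 coordinates are all assumptions for illustration, and real sshd log formats differ between systems.

```perl
use strict;
use warnings;

# Hypothetical sample lines; real sshd/syslog formats vary by system.
my @lines = (
    'Apr  7 06:12:01 host sshd[1234]: Failed password for root from 203.0.113.7 port 4242 ssh2',
    'Apr  7 06:12:05 host sshd[1234]: Failed password for invalid user admin from 203.0.113.7 port 4243 ssh2',
    'Apr  7 06:13:40 host sshd[1301]: Accepted password for alice from 198.51.100.23 port 50022 ssh2',
);

# Group log lines by the source address that follows the word "from".
sub group_by_ip {
    my @log = @_;
    my %by_ip;
    for my $line (@log) {
        next unless $line =~ /\bfrom\s+(\d{1,3}(?:\.\d{1,3}){3})\b/;
        push @{ $by_ip{$1} }, $line;
    }
    return \%by_ip;
}

# Escape the characters that are special in XML text.
sub xml_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# Build one KML placemark; KML expects longitude before latitude.
sub placemark {
    my ($ip, $lat, $lon, @details) = @_;
    my $desc = join "\n", map { xml_escape($_) } @details;
    return "<Placemark>\n"
         . "  <name>$ip</name>\n"
         . "  <description>$desc</description>\n"
         . "  <Point><coordinates>$lon,$lat</coordinates></Point>\n"
         . "</Placemark>\n";
}

my $by_ip = group_by_ip(@lines);
for my $ip (sort keys %$by_ip) {
    # 0,0 is a placeholder; a real script would use the latitude and
    # longitude returned by the geolocation database lookup.
    print placemark($ip, 0, 0, @{ $by_ip->{$ip} });
}
```

Grouping by address first means each attacker gets a single placemark that carries all of its log lines, instead of one placemark per line.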

Inside Google Earth, I see the following under “Places”:

Places

There are 6 entries that sshd generated, one from snort and one from fail2ban. Here’s one example:

2009-04-07-geforensics

And here’s another one from the US:

2009-04-07-geforensics3

For each entry one can also get traceroute, whois and black-list information. And to top it all off, there’s an animated tour feature that visits all the threats automatically (I wish KML supported auto-play and auto-repeat).

The script can run from the command line and its output can be redirected into a new kml-file like this:

$ perl geforensics.pl > foo.kml

Then take foo.kml and place it somewhere on a server or load it up directly in Google Earth.

The script also detects if it is running as a CGI. In that case it will use KML’s NetworkLink feature with refreshMode set to “onInterval” to refresh the placemarks automatically after a given period (I use 5 mins here). Pretty cool to see GE refresh the map automatically ;-)
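For the curious, here is roughly what the NetworkLink part could look like. Treat it as a sketch under assumptions: checking the GATEWAY_INTERFACE environment variable is one common way to detect CGI (I am not claiming the tool does it exactly this way), and the function names, the link name and the example.com URL are made up for illustration.

```perl
use strict;
use warnings;

# Web servers set GATEWAY_INTERFACE (e.g. "CGI/1.1") for CGI programs.
sub running_as_cgi {
    return defined $ENV{GATEWAY_INTERFACE} ? 1 : 0;
}

# Build a KML document whose NetworkLink re-fetches $href every $seconds.
sub network_link_kml {
    my ($href, $seconds) = @_;
    return <<"KML";
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <NetworkLink>
    <name>Log incidents</name>
    <Link>
      <href>$href</href>
      <refreshMode>onInterval</refreshMode>
      <refreshInterval>$seconds</refreshInterval>
    </Link>
  </NetworkLink>
</kml>
KML
}

# A CGI response needs a header; on the command line we print KML only.
if (running_as_cgi()) {
    print "Content-Type: application/vnd.google-earth.kml+xml\r\n\r\n";
}
# Hypothetical URL of the script that serves the actual placemarks;
# 300 seconds matches the 5 minute refresh mentioned above.
print network_link_kml('http://www.example.com/cgi-bin/geforensics.pl', 300);
```

Google Earth loads this small file once and then pulls the placemark data from the href on every interval, so the map stays current without reloading anything by hand.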

And here’s a screencast that puts it all together (watch it in full-screen). I start the tour of incidents, then select a specific host in China and initiate a traceroute. While the traceroute is running, I check whether the guy is listed in some black-lists (he is). Once the traceroute comes back, I go to the first hop (my ISP), then over to the destination, and check the WHOIS information for the ip-address.

And here’s a sample KML file if you want to see it for yourself (in Google Earth):
2009-04-07-gef.kml

The perl code for the little tool is now available at: 2009-04-16-gforensics.pl.gz (gzip compressed perl file – 4KB).


Mar 28 2009

Google Earth as a traceroute viewer

[Update 03/30/2009: Now that I have made sure that people won’t be able to bring down my server, you can try a live example by clicking the following link (you will still need Google Earth on your computer): Trace the path from my server to your system ]

[Update 04/07/2009: If you like this post, you may also want to take a look at the Google Earth Forensics post, which is IMHO a lot cooler :-) ]

[Update 04/16/2009: I added a link to source code at the end of this post. ]

I played a bit more with the idea that I presented in “Apache access_log to Google Earth KML” and, I think, I came up with something extremely cool.

When you surf around on the Internet (or use any TCP/IP service), your data is being routed through a long chain of gateways. Your packets are hopping from one system to the next, until the final destination has been reached. And data served by the remote destination is hopping its way back to you.

On Unix we have the traceroute utility (on Windows it’s called ‘tracert’) that allows you to figure out the route that your packets are taking to a remote destination.

In my web server’s access_log I can see a line where a host at the ip-address 67.195.37.190 accessed a certain url on my server. Resolving that ip-address to a name (via “host 67.195.37.190”) reveals that it is one of Yahoo.com’s crawlers. Using traceroute on the same address shows the following (I changed my internal network address to aaa.bbb.ccc.ddd below):

$ traceroute -q 1 -e -I 67.195.37.190
traceroute to 67.195.37.190 (67.195.37.190), 64 hops max, 60 byte packets
 1  netgear (aaa.bbb.ccc.ddd)  0.781 ms
 2  208-3-81-1.cnsp.net (208.3.81.1)  12.871 ms
 3  144.223.172.81 (144.223.172.81)  35.880 ms
 4  sl-bb20-ana-0-0.sprintlink.net (144.232.1.241)  54.828 ms
 5  sl-crs2-ana-0-13-5-0.sprintlink.net (144.232.1.177)  40.534 ms
 6  192.205.33.189 (192.205.33.189)  39.042 ms
 7  cr1.la2ca.ip.att.net (12.122.128.14)  54.288 ms
 8  cr1.sffca.ip.att.net (12.122.3.121)  51.418 ms
 9  12.122.137.97 (12.122.137.97)  45.766 ms
10  12.86.154.18 (12.86.154.18)  49.140 ms
11  so-1-0-0.pat1.swp.yahoo.com (216.115.110.43)  72.021 ms
12  as0.pat1.gqb.yahoo.com (216.115.96.45)  73.443 ms
13  xe-5-0-0.msr1.gq1.yahoo.com (66.196.67.1)  73.514 ms
14  xe-8-0-0.clr2-a-sat.gq1.yahoo.com (67.195.0.19)  76.818 ms
15  te-6-0.bas5-2-con.gq1.yahoo.com (98.137.31.34)  79.453 ms
16  llf320059.crawl.yahoo.net (67.195.37.190)  76.748 ms
$

This output tells me that the round trip from my system to the final destination at Yahoo and back takes roughly 80 milliseconds. However, it does not tell me what geographical path my packets take.

Hold on to your socks, because here’s the same path after I ran it through my little traceroute visualization tool (displayed in Google Earth):

Traceroute in Google Earth

The tool I created this morning automatically runs a traceroute to any ip-address and creates a Google Earth compatible KML file. Again we are using Marc’s free database to map ip-addresses to locations on the map. For each hop it records the hop’s ip-address, name and, if available, location. The tool also creates a tour that lets you jump from hop to hop in an animated fashion, until you arrive at your final destination, where even more information (Whois) is displayed.
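Here is a small sketch of the two core steps: pulling hop number, name and ip-address out of traceroute output, and turning a list of located hops into a KML path. It is an illustration, not the tool’s code. The names parse_hop and route_kml are invented, the regex assumes one probe per hop (as with -q 1 above), and the coordinates in the last line are placeholders rather than real lookup results.

```perl
use strict;
use warnings;

# Parse one line of traceroute output (one probe per hop, as with -q 1).
# Returns a hash ref, or nothing for headers and hops that timed out ("*").
sub parse_hop {
    my ($line) = @_;
    return unless $line =~ /^\s*(\d+)\s+(\S+)\s+\((\d+\.\d+\.\d+\.\d+)\)\s+([\d.]+)\s+ms/;
    return { hop => $1, name => $2, ip => $3, ms => $4 };
}

# Turn located hops into a KML LineString; coordinates are lon,lat,alt.
sub route_kml {
    my @points = @_;    # each element: [ $lat, $lon ]
    my $coords = join ' ', map { "$_->[1],$_->[0],0" } @points;
    return "<Placemark>\n  <name>Route</name>\n"
         . "  <LineString>\n    <tessellate>1</tessellate>\n"
         . "    <coordinates>$coords</coordinates>\n  </LineString>\n</Placemark>\n";
}

my @output = (
    'traceroute to 67.195.37.190 (67.195.37.190), 64 hops max, 60 byte packets',
    ' 2  208-3-81-1.cnsp.net (208.3.81.1)  12.871 ms',
    ' 3  *',
    '16  llf320059.crawl.yahoo.net (67.195.37.190)  76.748 ms',
);

# Keep only the lines that describe a responding hop.
my @hops = map { parse_hop($_) } @output;
printf "%2d  %-15s  %s\n", @{$_}{qw(hop ip name)} for @hops;

# Placeholder coordinates; the real ones come from the geolocation database,
# and hops without a database entry are simply left out of the list.
print route_kml([35.0, -106.0], [37.0, -122.0]);
```

The tessellate flag asks Google Earth to drape the line over the terrain, so long segments follow the globe instead of cutting through it.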

When you try one of the sample files below in Google Earth, just double-click the “Animate Route (play me)” item in the “Places” area:

Animate Route

As you jump from hop to hop, information about the current gateway is being displayed roughly in the geo location where that gateway is physically located (the free database mentioned above does have some hosts that are not mapped and I’m skipping those automatically).

And here are two KMZ files that you can download to Google Earth in order to see the stuff:

“Complete” above means that I was able to trace the route all the way to the destination host. And “Incomplete” means that I aborted the trace after a number of systems along the path did not respond to my trace queries.

And if you want to see a live example, click the following link to see the path from my server to your system.

The perl code for the little tool is now available at: 2009-04-16-gtrace.pl.gz (gzip compressed perl file – 2.5KB).


Mar 25 2009

Apache access_log to Google Earth KML

Where are the visitors to kahunaburger.com coming from? I admitted in the past that I’m a log-file junkie. There’s usually a terminal window on my desktop that runs multitail (or my own airlog) against a number of log-files on various servers. Line after line scrolls by as people hit those servers. The ip-address alone does not tell you much about the visitor, and I always wanted to see where those ip-addresses are located.

Just the other day I saw a link to a free IP address geolocation SQL database (thanks Marc for making that one available). I downloaded the 11MB file and added the database to my mysql server.

Next, I created a simple perl script that walks over my web server’s (apache) access_log, extracts ip-addresses, access-date/time and url, and finally converts all those items (using the above-mentioned database) into a KML file that can be fed to Google Earth.

The result looks like this in Google Earth:
2009-03-25-log2kml

And here’s the script that does the magic (it assumes that you have stored the database in “ipinfo”):

use strict;
use DBI;

my %seen;
my $dbh = DBI->connect("dbi:mysql:ipinfo","username","password")
    or die "unable to connect to database: $DBI::errstr";
# kml header (KML element names are case-sensitive)
print qq{<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
\t<Document>
};
# loop over access_log lines
while(<>) {
    # does it look like an access log entry?
    next unless (/^(\d+)\.(\d+)\.(\d+)\.(\d+).*\[([^\]]+)\]\s+"(\S+)\s+(\S+)\s+HTTP\/\d\.\d/);
    my($a,$b,$c,$d,$date,$method,$uri)=($1,$2,$3,$4,$5,$6,$7);
    # make sure we have a good ip address (\d+ can never match a negative number)
    next if ($a > 255 || $b > 255 || $c > 255 || $d > 255);
    # did we see this IP already?
    next if $seen{"$a.$b.$c.$d"}++;
    # compute value for ipinfo lookup (built from the first three octets)
    my($val)=($a*256+$b)*256+$c;
    # fetch ipinfo data: the closest range starting at or below $val
    my($r)=$dbh->selectall_arrayref(qq{select country_code,region_code,city,latitude,longitude
        from ip_group_city where ip_start<=? order by ip_start desc limit 1}, undef, $val);
    # no information? no placemark! (an empty result is an empty array ref)
    next if !defined($r) || !@$r;
    print qq{\t\t<Placemark>\n};
    print qq{\t\t\t<name>$a.$b.$c.$d</name>\n};
    print qq{\t\t\t<description>\n<![CDATA[\n};
    print qq{<b>$method $uri from $r->[0]->[0]/$r->[0]->[1]/$r->[0]->[2] at <i>$date</i></b><br />\n};
    print qq{]]>\n\t\t\t</description>\n};
    print qq{\t\t\t<Point>\n};
    # KML wants longitude first, then latitude
    print qq{\t\t\t\t<coordinates>$r->[0]->[4],$r->[0]->[3]</coordinates>\n};
    print qq{\t\t\t</Point>\n};
    print qq{\t\t</Placemark>\n};
}
# kml trailer
print qq{\t</Document>
</kml>
};
$dbh->disconnect();

And you run the above script via:

perl log2kml.pl < access_log > output.kml
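If the lookup value in the script looks a bit cryptic, here is the arithmetic on its own, using the Yahoo crawler address that shows up elsewhere on this page. The helper name ip_to_key is made up for this sketch; the formula is the one from the script, (a*256+b)*256+c.

```perl
use strict;
use warnings;

# Convert a dotted-quad address into the lookup key used by the script:
# (a*256 + b)*256 + c. The last octet is ignored, so every address in the
# same /24 block maps to the same key. Returns undef for invalid input.
sub ip_to_key {
    my ($ip) = @_;
    my @o = $ip =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/ or return;
    return if grep { $_ > 255 } @o;
    return ($o[0] * 256 + $o[1]) * 256 + $o[2];
}

print ip_to_key('67.195.37.190'), "\n";   # prints 4440869
print ip_to_key('67.195.37.1'),   "\n";   # prints 4440869 (same /24 block)
```

The query then picks the row with the largest ip_start that is still less than or equal to this key, which is the range the address falls into.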

Here’s a little sample file from my web-server. Each ip-address is only recorded once, so if you have the same person visit your site several times, only the first hit will be shown and subsequent ones are ignored: log2kml.kmz (51K – click to open it in Google Earth)

Next up is a version that does live-tracking: as the web server is hit, Google Earth will automatically rotate to the location associated with the ip-address :-)