Ip2Location (and IPinfoDB) Performance Tips

I’ve done a number of talks lately on Worldmaps and typically in side conversations/emails, people are curious about the databases and converting IP addresses to geographic locations.   And, often when you dive into using the data, it seems there are a number of performance considerations and I thought I’d share my input on these topics. First up, the data.  Worldmaps uses two databases for IP resolution.  The primary/production database is Ip2Location.  I’ve found this database to be very accurate.  For development/demo purposes, I use IPinfoDB.  I haven’t had too much time to play with this database yet, but so far seems accurate also.   The latter is free, whereas Ip2Location is not. In either case, the schema is nearly identical: The BeginIp and EndIp columns are a clustered primary key.  In the case of IPinfoDB, there is no EndIp field (and it’s not really needed).  When performing a resolution, a string IP address is converted into a 64 bit integer and then used in searching the table.  That’s why having a clustered key on the BeginIp (and optionally EndIp) is crucial to performance. But it doesn’t stop there.   The examples posted in the database’s respective home pages are accurate and simple, but need to be refactored for performance.  For example, to do a simple resolution on Ip2Location, according to their docs:
SELECT * FROM dbo.Ip2Location WHERE @IpNum BETWEEN BeginIp and EndIp
And for IPInfoDB:
SELECT TOP 1 * FROM IPInfoDB where BeginIp <= @IpNum ORDER BY BeginIp DESC
Both of these methods are perfectly fine, particularly for use as generic samples.  The second one is on the right track, but it doesn’t work for joins so if you’re querying over a range, you’d need to refactor.  And in the first example, using a BETWEEN operator forces a clustered index scan when joining, killing the performance. If we run the first example across my minified Ip2LocationSmall table, we’ll see something like this (and this is running off of SQL Azure – the perf is pretty great compared to localhost!): We can also look at the time: Ouch!  Now, it doesn’t seem too bad, but imagine doing thousands of these requests per minute, or doing large joins.  The goal then is to provide some hints that will optimize the query, particularly for joins.  Our indexes are correct, so we can rework the query to get rid of the BETWEEN operator – we can sacrifice a little readability and do something like:
SELECT *    FROM (     select         ( select MAX(beginip)           from dbo.Ip2LocationSmall           where BeginIp <= @IpNum         ) as IP_Begin                ) as foo INNER JOIN dbo.Ip2LocationSmall iploc ON iploc.BeginIp = foo.IP_Begin
The result: And the time shows some improvement: But the REAL benefit comes when we need to join.   Suppose I’d like to get a list of the countries for a given map (which is a parameter called MapId):
SELECT DISTINCT(ip.CountryCode) FROM MapHits hits INNER JOIN dbo.Ip2LocationSmall ip     ON hits.IpNum BETWEEN ip.BeginIp AND ip.EndIp WHERE MapId = @MapId
The query returns 95 rows, and executes in 16 seconds: In this case, we can refactor this using the method above to something like:
SELECT     DISTINCT(CountryCode)    FROM (      select IpNum,         ( select MAX(beginip)           from Ip2LocationSmall           where BeginIp <= IpNum         ) as IP_Begin       from dbo.MapHits as hits       where MapId = @MapId   ) as foo INNER JOIN Ip2LocationSmall iploc ON iploc.BeginIp = foo.IP_Begin
Again, not as pretty looking, but boy what a difference: We went from 16,500 milliseconds to 260 – over 60x the performance!  Mike @AngryPets would be proud.   The reason for the perf gain is we were able to eliminate the nested loop, which is (in this case) scanning the entire clustered index for the matching rows. The second benefit is the ability to switch schemas easily between IP2Location and IPinfoDB, and we can additionally lose the EndIp column which trims the size of the table.

My Apps

Dark Skies Astrophotography Journal Vol 1 Explore The Moon
Mars Explorer Moons of Jupiter Messier Object Explorer
Brew Finder Earthquake Explorer Venus Explorer  

My Worldmap

Month List