Geographic Reporting, Part 1

I decided to break "How to Create a Geography Map of Your Website Visitors" into a few parts, since each is somewhat independent and because I tend to be long-winded. For background on what I'm talking about here, read this post. So, let's get to the first part: gathering visitor data.

This one should sound obvious, and you may already be doing it. I'm not talking about resolving the user to a location just yet; that's part 2. The one big element we need to capture is the IP address of the request. Whether you do this through a log analyzer, an HTTP module, or by asking the user doesn't matter (though asking the user won't be very successful, I'd imagine). Very likely, though, the choice will have a profound long-term influence on performance and on the size and manageability of the data.
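As a rough illustration of the log analyzer route, here is a minimal sketch in Python that pulls the IP address (plus the timestamp and requested path) out of a single web server log line. It assumes the server writes the Common Log Format; your server's format may well differ, and the names `LOG_LINE` and `parse_hit` as well as the sample line are made up for this example.

```python
import re
from datetime import datetime

# Assumed layout (Common Log Format):
#   host ident authuser [date] "request" status bytes
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def parse_hit(line):
    """Return a dict describing one hit, or None if the line does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    return {
        "ip": match.group("ip"),
        # e.g. 10/Oct/2008:13:55:36 -0700
        "when": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "path": match.group("path"),
        "status": int(match.group("status")),
    }

# Hypothetical sample line; the address is from a documentation-only range.
sample = '203.0.113.7 - - [10/Oct/2008:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
hit = parse_hit(sample)
print(hit["ip"], hit["when"].isoformat(), hit["path"])
```

Lines that do not match the assumed format come back as `None`, so a real importer can count or skip them instead of failing halfway through a log file.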

I'm going to assume that someone looking to do this is running a small website. My hunch is that the Amazons and Microsofts of the world outsource their analytics or have teams of people that handle it (in fact, I know they do). Very small websites can likely log hits directly to a database, but as traffic grows, the amount of history you can practically keep obviously decreases. One company I worked for had an analytics engine that stored about 3 months' worth of log files (page requests only) in SQL Server. The database was 18 GB, but it worked. Reports and cleanup generally took all weekend to run. The software and hardware required to support high-volume sites come with a hefty price tag.

In addition to the IP address, we'll need to store any data needed to support the reports: for example, the date and time of the hit to allow time-based reporting. We may also want to log the page that was requested, or the user agent. Whether you store just the initial hit that begins the session or every hit depends on your requirements. Obviously the latter offers a lot of additional info (for example, we can then plot only visitors with at least a certain number of page views), but this comes at a steep price in storage and processing.
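To make the fields above concrete, here is a minimal sketch of a hits table and a "visitors with at least N page views" query, using Python's built-in `sqlite3` module with an in-memory database. The table layout, the column names, and the helpers `log_hit` and `visitors_with_min_views` are assumptions for illustration, not the schema I actually use, and grouping by IP address is only a rough stand-in for a visitor or session.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("""
    CREATE TABLE hits (
        id         INTEGER PRIMARY KEY,
        ip         TEXT NOT NULL,   -- address of the request
        hit_time   TEXT NOT NULL,   -- ISO 8601 timestamp
        page       TEXT,            -- requested page (optional)
        user_agent TEXT             -- browser user agent (optional)
    )
""")

def log_hit(conn, ip, hit_time, page=None, user_agent=None):
    """Store one page request."""
    conn.execute(
        "INSERT INTO hits (ip, hit_time, page, user_agent) VALUES (?, ?, ?, ?)",
        (ip, hit_time, page, user_agent),
    )

def visitors_with_min_views(conn, min_views):
    """Return (ip, views) pairs for addresses with at least min_views hits."""
    cursor = conn.execute(
        "SELECT ip, COUNT(*) AS views FROM hits "
        "GROUP BY ip HAVING COUNT(*) >= ? ORDER BY views DESC, ip",
        (min_views,),
    )
    return cursor.fetchall()

# Hypothetical sample data, using documentation-only address ranges.
log_hit(conn, "203.0.113.7", "2008-10-10T13:55:36", "/index.html", "ExampleBrowser/1.0")
log_hit(conn, "203.0.113.7", "2008-10-10T13:56:02", "/about.html", "ExampleBrowser/1.0")
log_hit(conn, "203.0.113.7", "2008-10-10T13:57:40", "/contact.html", "ExampleBrowser/1.0")
log_hit(conn, "198.51.100.4", "2008-10-10T14:01:11", "/index.html", "ExampleBrowser/1.0")

frequent = visitors_with_min_views(conn, 2)
print(frequent)
```

If you only store the first hit of each session, the table stays much smaller, but a query like this one is no longer possible; that is the trade-off described above.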

As I said, you may already be doing some basic logging to a database, and if so, you're all set with part 1. If not, you'll need to get that data into a relational database. While I could discuss how I do it, I imagine everyone has their own unique requirements, and it's beyond the scope of what I want to talk about here. In part 2, we'll get to the fun stuff: I'm going to look at how we figure out where the IPs are located. Part 3 will discuss mapping the data to an image.