Visual Data ... cool.

So, like every other geek out there with a website and a fondness for writing code, I find it fun to collect data. The fun part of coming up with a database design is figuring out cool ways to relate data and make nifty little reports. Heck, I finally had a real use for a cross join query not too long ago (read this post).

The thing is, presenting that data in a way that looks cool isn't easy. A "cheesy" way to do basic charting is raw HTML manipulation -- stretching various images to make bar graphs and the like. (One example of this is in the upper left, a thermometer control for a charity auction site James and I developed.) I've done a few apps that use GDI/GDI+ -- the CAPTCHA control on this site (which will likely get replaced soon with Mike's Reverse DOS tool), my imaging controls, and WolfClock (both under my development section).

Once I received my free copy of Telerik's Controls Suite, though, I thought it would be cool to put some of the charting to use. I'm glad I did -- it's really neat to take query results that are textual data and present them graphically. Visually, it's so much easier to see the correlations in the data. I may eventually move this to the admin area of the site, but for the time being I'm leaving it open. The nice part is: the backend collects the data, some business objects query it, and the dataset(s) are passed to the control ... in real time (well, there's some caching, but generally in real time!). The charting control is nice -- but not perfect. The data binding is pretty much a manual process of iterating the data source. But, assuming that isn't a problem and assuming you know your data, it's very easy to put together cool little charts.
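To illustrate what "manual data binding" means here, below is a minimal, language-agnostic sketch (in Python, since the post includes no code): instead of handing the chart a dataset and letting it bind itself, you walk the rows yourself and append one chart item per row. The `Series`/`add_item` names are purely illustrative, not the Telerik API.

```python
# Hypothetical sketch of the manual binding loop described above.
# Pretend these rows came back from a browser-stats query.
rows = [
    ("IE 6", 512),
    ("Firefox 1.0", 317),
    ("Safari", 44),
]

class Series:
    """Stand-in for a chart series; holds (label, value) items."""
    def __init__(self, name):
        self.name = name
        self.items = []

    def add_item(self, label, value):
        self.items.append((label, value))

# The "manual process of iterating the data source":
chart_series = Series("Browser hits")
for browser, hits in rows:
    chart_series.add_item(browser, hits)

print(len(chart_series.items))  # → 3
```

The upside of doing the iteration yourself is that any transformation (scaling, grouping, dropping noise rows) can happen right in the loop before the item ever reaches the chart.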

Check out these pages:

Website Browser Stats
Traffic Analysis

More to come...

Comments (2)

James Byrd
9/23/2005 3:14:13 AM #

The graphs look great. Sometimes a line chart really just does a better job of expressing data, and producing one is a toughie with the "cheesy" method ;-)

I'm curious as to how you "scrub" your data. Did you just build a scrub list of agents over time yourself, or did you find a good source for that information? I imagine that the sources of log noise change over time, so you'd probably have to just do your own analysis. That would let you jettison the noise the most reliably.

9/23/2005 9:21:27 AM #

Hi James,

Good question.  In short, no, I haven't found a good source for scrubbing the data.  Right now pretty much everything will get logged in the database, but there's a scrubbing routine that runs regularly to examine the data.  

It's pretty much a static analysis.  Any bogus referrers are easily spotted, so those entire sessions are purged.   On top of that, I'll look at a referrer report to see if anyone is spamming -- to date, there's only been 1 or 2 and they've been added to the purge list.  

The next pass is a user agent scrub against known patterns from spiders/bots/etc.  Like the referrers, a distinct report shows any abnormalities -- if something looks suspicious, it's added to the scrub list.

Finally, there's static analysis on the IPs, filtering all of mine, so I don't influence the traffic.

But -- as you've guessed -- it's all static analysis that requires the occasional babysitting to make sure it's accurate.
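Putting the three scrubs together, the pass over the log amounts to a filter like the sketch below. This is a hypothetical Python illustration of the approach described above, not the actual routine -- the lists, field names, and sample hits are all made up for the example.

```python
import re

# Illustrative scrub lists (the real ones are maintained by hand over time).
BAD_REFERRERS = {"http://spam.example.com/"}          # known referrer spammers
BOT_PATTERNS = [re.compile(p, re.I)                   # known spider/bot agents
                for p in (r"googlebot", r"slurp", r"crawler")]
MY_IPS = {"192.0.2.10"}                               # my own addresses

def keep_hit(hit):
    """True if a logged hit survives all three static scrubs."""
    if hit["referrer"] in BAD_REFERRERS:
        return False                                  # purge spammed sessions
    if any(p.search(hit["user_agent"]) for p in BOT_PATTERNS):
        return False                                  # drop spiders/bots
    if hit["ip"] in MY_IPS:
        return False                                  # don't count myself
    return True

log = [
    {"referrer": "", "user_agent": "Mozilla/4.0", "ip": "203.0.113.5"},
    {"referrer": "", "user_agent": "Googlebot/2.1", "ip": "66.249.0.1"},
    {"referrer": "http://spam.example.com/", "user_agent": "Mozilla/4.0",
     "ip": "198.51.100.2"},
    {"referrer": "", "user_agent": "Mozilla/4.0", "ip": "192.0.2.10"},
]
clean = [h for h in log if keep_hit(h)]
print(len(clean))  # → 1 (only the first, legitimate hit survives)
```

Because the lists are static, the accuracy of the whole pass depends on the periodic babysitting mentioned above: each distinct report can surface a new pattern to add.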

