Back when I worked at Scholastic, I remember talking to some of the guys on the Grolier Interactive team.  They were running an application that would go through the encyclopedia and verify links and give a comprehensive report ... obviously something like this is needed when you've got a commercial product; it would reflect poorly on the company to have a lot of broken links in the shipping product.

These days, tools like this are pretty common but I decided to write my own.  Really, all I wanted was a way to crawl my site and look for bad or broken links.  I added some simple reporting to show a link tree, response times, external links, and errors.  I haven't put the polish on it yet (meaning, tightening up the code and documenting it) so I won't release the source for at least a few more days.  I also wanted to see if there'd be any early adopters who'd give it a try and see if they have any problems (*nudge* *nudge*).

You can download it here: StructureTooBig SiteSpider (requires .NET Framework 2.0)

The SiteSpider is smart enough not to get caught in circular references -- but a word of warning: if you point this thing at a monster site, it's pretty easy to do the math and see that 5 or 6 levels deep may mean crawling tens of thousands of links.  There are a bunch of settings that are adjustable, including the Max Crawl Depth.  The current status (total links, current depth, and number of links queued) is displayed in the bottom left of the window.
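
For the curious, here's roughly what "not getting caught in circular references" and a depth limit look like in code. This is a minimal Python sketch of the general technique (a breadth-first crawl with a set of already-seen URLs), not the SiteSpider source, which is .NET and not released yet. The names `crawl`, `LinkExtractor`, `fetch`, and `max_depth` are all made up for illustration; `fetch` is whatever function you supply that takes a URL and returns a status code and the page HTML. As for the math: if each page contributes, say, 8 new links, then 5 levels down is already 8^5 = 32,768 pages.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(start_url, fetch, max_depth=3):
    """Breadth-first crawl starting at start_url.

    fetch(url) is a caller-supplied (hypothetical) function that returns
    (status_code, html_text). Returns {url: (depth, status_code)}.
    """
    site = urlparse(start_url).netloc
    seen = {start_url}                # every URL ever queued; this breaks cycles
    queue = deque([(start_url, 0)])   # (url, depth) pairs waiting to be checked
    report = {}

    while queue:
        url, depth = queue.popleft()
        status, html = fetch(url)
        report[url] = (depth, status)

        # Check external and broken links, but never follow links out of them,
        # and stop descending once the depth limit is reached.
        if status != 200 or depth >= max_depth or urlparse(url).netloc != site:
            continue

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" so the same page
            # is not queued twice under different names.
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))

    return report
```

Because a URL goes into `seen` the moment it is queued, a page that links back to the home page (or to itself) costs nothing extra, and the queue length at any moment is the same kind of "links queued" number the status bar shows.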

There are a few design considerations I'm still thinking about -- I'm currently keeping the response text with the link, but may remove this for performance reasons.  I'm also considering multithreading the worker process, but hitting a site with a bunch of parallel requests generally rubs website owners the wrong way.

This tool can also be used to preload the cache of a website, allowing the site to warm up before being put under load.

More info to come, and the usual "use at your own risk" disclaimer applies.
