mike hodnick

Point your browser to www.hodnick.com for Mike's latest content.

Notice:

You are viewing Mike's old, archived site. For new content, navigate to hodnick.com

Latest From Twitter...

The Blog

The demographics project has been revived. In between attending to Eva and working on the basement, I've been digging in to code. I haven't actually gotten in to any WPF stuff yet - I've been focusing on parsing the cartographic files from the US census bureau. In my earlier experimentation with 3D in Avalon, I gave the cartographic data just barely enough attention to make things work. This time around, I want the data to be complete, which means parsing many, many files and having a faster parsing process.

The files break down like this:

  • Two files that describe US state boundaries (about 8 MB total)
  • Two files that describe US counties (about 52 MB total)
  • Two files for each state that describe cities (for Minnesota, this is about 2.5 MB; ~ 125 MB for the entire US)
  • Two files for each state that describe zip codes (for Minnesota, this is about 10 MB; ~ 500 MB for the entire US)

I had a first attempt at a parsing process that ran extremely slow. It parsed the entire state files and a fraction of the county files in about a half hour. Way too slow. I improved the design and now it flies and will parse states and counties in a few minutes. Cities and zip codes are buggy at the moment.

Once I get all the data parsed, I'm estimating there might be around 1,000,000 individual latitude and longitude points. What's interesting is that states with a lot of coastline have the most data points. Alaska has about 7400. I haven't counted/observed any other states, but at a glance it looks like Florida, Maine, Hawaii, and California complete the top 5.

Once the data is in SQL, I'll write some more...

posted on Sunday, January 29, 2006 7:11 PM |