The demographics project has been revived. In between attending to Eva and working on the basement, I've been digging into code. I haven't actually gotten into any WPF stuff yet; I've been focusing on parsing the cartographic files from the US Census Bureau. In my earlier experimentation with 3D in Avalon, I gave the cartographic data just barely enough attention to make things work. This time around, I want the data to be complete, which means parsing many, many files and having a faster parsing process.
The files break down like this:
- Two files that describe US state boundaries (about 8 MB total)
- Two files that describe US counties (about 52 MB total)
- Two files for each state that describe cities (for Minnesota, this is about 2.5 MB; ~125 MB for the entire US)
- Two files for each state that describe zip codes (for Minnesota, this is about 10 MB; ~500 MB for the entire US)
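As a rough sanity check on those sizes, here is a quick back-of-the-envelope sketch in Python. It assumes the nationwide city and zip code figures come from multiplying the Minnesota numbers by 50 states, which matches the totals in the list but is only a guess at how they were derived. The names `STATE_COUNT` and `estimate_sizes_mb` are made up for illustration.

```python
# Approximate sizes in MB, taken from the list above.
STATE_FILES_MB = 8
COUNTY_FILES_MB = 52
MN_CITY_FILES_MB = 2.5
MN_ZIP_FILES_MB = 10

# Assumption: Minnesota is treated as a typical state and scaled by 50.
STATE_COUNT = 50


def estimate_sizes_mb():
    """Return the estimated size in MB of each group of files."""
    return {
        "states": STATE_FILES_MB,
        "counties": COUNTY_FILES_MB,
        "cities": MN_CITY_FILES_MB * STATE_COUNT,
        "zip codes": MN_ZIP_FILES_MB * STATE_COUNT,
    }


sizes = estimate_sizes_mb()
total_mb = sum(sizes.values())

for name, size in sizes.items():
    print(f"{name}: ~{size:.0f} MB")
print(f"total: ~{total_mb:.0f} MB")
```

Under that assumption, the whole data set comes to roughly 685 MB, with the zip code files accounting for most of it.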
My first attempt at a parsing process ran extremely slowly. It parsed the state files in full and a fraction of the county files in about half an hour. Way too slow. I improved the design, and now it flies: it parses states and counties in a few minutes. Cities and zip codes are buggy at the moment.
Once I get all the data parsed, I'm estimating there might be around 1,000,000 individual latitude and longitude points. What's interesting is that states with a lot of coastline have the most data points. Alaska has about 7,400. I haven't counted any other states, but at a glance it looks like Florida, Maine, Hawaii, and California round out the top five.
Once the data is in SQL, I'll write some more...