Not sure if Stack Overflow is the right site for this, but since there are many DW developers here…
I’m building a data warehouse for a graduation project, and for that I need a good dataset, and by good I mean bad 🙂 I need a dataset that requires a lot of transformations and is spread across many files (with varied or weird formatting if possible). It should also have a lot of columns so that a moderately large cube can be built on it. Most of the datasets available on the internet are too simple for this. Can anyone recommend something?
Perhaps you could use US Census data? There are many different kinds of data available. Maybe focus on a specific state? Your cube could allow roll-ups across various political or geographic areas, or by various demographics.
http://www.census.gov/population/www/cen2010/glance/
It doesn’t appear that all of the data is available yet, so you could use the 2000 census instead.
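To make the roll-up idea a bit more concrete, here is a minimal sketch in plain Python. The rows, the column names (`state`, `county`, `age_group`, `population`) and the `rollup` helper are all made up for illustration; they are not real census figures or the actual census file layout, which you would have to map onto something like this during your ETL step.

```python
from collections import defaultdict

# Hypothetical toy fact rows. The values are invented for illustration
# and are NOT real census numbers or real census column names.
ROWS = [
    {"state": "A", "county": "A1", "age_group": "0-17", "population": 10},
    {"state": "A", "county": "A1", "age_group": "18+", "population": 30},
    {"state": "A", "county": "A2", "age_group": "0-17", "population": 5},
    {"state": "A", "county": "A2", "age_group": "18+", "population": 15},
    {"state": "B", "county": "B1", "age_group": "0-17", "population": 8},
    {"state": "B", "county": "B1", "age_group": "18+", "population": 12},
]


def rollup(rows, dims, measure="population"):
    """Sum `measure` over all rows, grouped by the dimensions in `dims`."""
    totals = defaultdict(int)
    for row in rows:
        # The group key is the tuple of this row's values for the chosen dimensions.
        key = tuple(row[d] for d in dims)
        totals[key] += row[measure]
    return dict(totals)


# Finest geographic grain: one total per (state, county).
by_county = rollup(ROWS, ["state", "county"])
# Roll up the geography: drop the county level and keep only the state.
by_state = rollup(ROWS, ["state"])
# Slice by a demographic dimension instead of geography.
by_age = rollup(ROWS, ["age_group"])

print(by_county)
print(by_state)
print(by_age)
```

In a real warehouse the cube engine (or a SQL `GROUP BY`) does this for you; the point is only that each roll-up is the same aggregation with fewer dimensions in the key, so the more geographic and demographic columns the dataset has, the more levels your cube can expose.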