I recently got an assignment in my Java programming class to analyze a dataset (a very small one, I would guess). I really enjoyed the assignment, including the use of a ‘tokenizer’, which was a new concept to me. The dataset itself was pretty boring, though, as it only contained dates.
What I’m looking for is:
Public datasets (XML, txt or similar) to practice analysis on
This can be anything really (preferably something pretty simple), as I’m mainly trying to print out statistics, patterns and graphs.
Try the Stack Overflow data dump.
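If you go that route, here is a rough sketch of the kind of statistic you could print from it in Java. It assumes the dump's files are XML made up of `<row>` elements with a `CreationDate` attribute in ISO format; check the files you actually download, since that layout is an assumption here. The class name `DumpStats` and method `countRowsByYear` are made up for illustration. It uses the JDK's built-in StAX streaming parser, so a large file would not have to fit in memory.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class DumpStats {

    // Counts <row> elements per year. The year is taken from the first four
    // characters of the CreationDate attribute (assumed name and format).
    public static Map<String, Integer> countRowsByYear(Reader in) throws XMLStreamException {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by year
        XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            while (xml.hasNext()) {
                if (xml.next() == XMLStreamConstants.START_ELEMENT
                        && "row".equals(xml.getLocalName())) {
                    String date = xml.getAttributeValue(null, "CreationDate");
                    // Skip rows without a usable date instead of failing.
                    if (date != null && date.length() >= 4) {
                        counts.merge(date.substring(0, 4), 1, Integer::sum);
                    }
                }
            }
        } finally {
            xml.close();
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Tiny made-up sample so the sketch runs without downloading anything.
        String sample = "<posts>"
                + "<row Id=\"1\" CreationDate=\"2008-07-31T21:42:52\" />"
                + "<row Id=\"2\" CreationDate=\"2008-08-01T09:15:00\" />"
                + "<row Id=\"3\" CreationDate=\"2009-01-12T18:03:41\" />"
                + "</posts>";
        countRowsByYear(new StringReader(sample))
                .forEach((year, n) -> System.out.println(year + ": " + n));
    }
}
```

To run it against a real file, you would pass a `java.io.FileReader` (or similar) instead of the `StringReader`. The same loop works for other per-row counts by reading a different attribute.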