I have a single CSV file with data about schools: their locations, their names, and a number of indicator values for each:
County_id, County_name, Municipality_id, Municipality_name, School_id,
School_name, Year, Indicator_1, Indicator_2, Indicator_3 […]
I am building an interactive JavaScript visualization around this data and would like to serve it using MongoDB.
Some typical queries (in pseudocode) would be:
- Get a list of all Schools and their respective locations
- Get all Indicator_1 values for all years for schools in a specific location
- Count the number of schools in a specific location
1. Based on the data given, how should I set up my MongoDB documents/collections to be able to answer queries such as the above?
(Hints on what intermediate steps I should take to import and build the schema are also welcome.)
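For aggregate queries, check out MongoDB’s MapReduce. It writes its results to an output collection that you name, so each map/reduce job can produce a new collection, or it can return the results inline without storing them. Note that recent MongoDB versions deprecate map-reduce in favor of the aggregation pipeline, which also handles simple counts and groupings like the ones you describe.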
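To make the map/reduce idea concrete, here is a minimal sketch in plain Python of how "count the number of schools in a specific location" breaks down into a map step and a reduce step. It does not talk to MongoDB at all; the sample documents, the field values, and the helper names (`map_doc`, `reduce_values`, `map_reduce`) are made up for illustration, and it assumes one document per school per year, as in the CSV.

```python
from collections import defaultdict

# Hypothetical sample documents, one per school per year, mirroring the CSV columns.
docs = [
    {"Municipality_name": "Alpha", "School_id": 1, "Year": 2010, "Indicator_1": 3.5},
    {"Municipality_name": "Alpha", "School_id": 1, "Year": 2011, "Indicator_1": 3.7},
    {"Municipality_name": "Alpha", "School_id": 2, "Year": 2010, "Indicator_1": 4.1},
    {"Municipality_name": "Beta", "School_id": 3, "Year": 2010, "Indicator_1": 2.9},
]

def map_doc(doc):
    """Map step: emit (municipality, {school id}) for each document."""
    yield doc["Municipality_name"], {doc["School_id"]}

def reduce_values(key, values):
    """Reduce step: merge all sets of school ids emitted for one municipality."""
    merged = set()
    for value in values:
        merged |= value
    return merged

def map_reduce(documents):
    """Group the emitted values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_values(key, values) for key, values in grouped.items()}

# A school appears once per year, so count distinct ids rather than rows.
school_counts = {name: len(ids) for name, ids in map_reduce(docs).items()}
print(school_counts)  # {'Alpha': 2, 'Beta': 1}
```

Collecting school ids in a set matters here: counting rows instead would count each school once per year it appears in the data.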
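You can use mongoimport (with --type csv and --headerline, so the first row supplies the field names) to import the CSV data into a collection, one document per row, then leverage map/reduce to get the data you want.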
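If you would rather shape the rows yourself before loading them, the following is a rough sketch of turning CSV rows into typed documents, one per school per year. The sample rows, the `INT_FIELDS` set, and the function names are assumptions for illustration, as is the guess that the id and year columns are integers and the indicators are numeric. The resulting list of dicts is the kind of thing you could hand to a driver's bulk insert (for example pymongo's `insert_many`).

```python
import csv
import io

# Hypothetical sample using the column names from the question.
SAMPLE_CSV = """County_id,County_name,Municipality_id,Municipality_name,School_id,School_name,Year,Indicator_1
1,North,10,Alpha,100,Alpha Primary,2010,3.5
1,North,10,Alpha,100,Alpha Primary,2011,3.7
1,North,11,Beta,200,Beta Primary,2010,
"""

# Assumption: these columns hold integers in the real file.
INT_FIELDS = {"County_id", "Municipality_id", "School_id", "Year"}

def row_to_document(row):
    """Convert one CSV row (all strings) into a document with typed values."""
    doc = {}
    for name, value in row.items():
        if name in INT_FIELDS:
            doc[name] = int(value)
        elif name.startswith("Indicator_"):
            # Empty cells become None instead of failing the float conversion.
            doc[name] = float(value) if value != "" else None
        else:
            doc[name] = value
    return doc

def load_documents(text):
    """Parse CSV text and return a list of documents, one per row."""
    return [row_to_document(row) for row in csv.DictReader(io.StringIO(text))]

documents = load_documents(SAMPLE_CSV)

# Plain-Python version of "all Indicator_1 values for all years in a location".
alpha_values = [
    (d["School_id"], d["Year"], d["Indicator_1"])
    for d in documents
    if d["Municipality_name"] == "Alpha"
]
print(alpha_values)  # [(100, 2010, 3.5), (100, 2011, 3.7)]
```

Keeping the flat one-row-per-document shape means each of the example queries is a simple filter on Municipality_name or County_name; whether to nest the yearly values under a single document per school instead is a design choice that depends on how the visualization reads the data.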