I have a set of objects in MongoDB that each have a set of values embedded in them, e.g.:
[1.22, 12.87, 1.24, 1.24, 9.87, 1.24, 87.65] // ... up to about 150 values
Is map/reduce the best solution for finding the median (middle value) and mode (most common value) of each embedded array? I ask because the map and reduce functions both have to return structurally identical values, whereas in my case I want to take in a set of values (the array) and return a pair of values (median, mode).
If not, what’s the best way to approach this? I want it to run in a rake task, if that’s relevant. It’d be an overnight data crunching kind of thing.
I assume you want the mode and median of each document. You can do this with map/reduce: calculate the median and mode in the map function, and have reduce return the map result untouched.
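A minimal sketch of that idea, not the original answer's code. It assumes each document keeps its numbers in a field named `values` (a hypothetical name), and it nests the helpers inside the map function because the function runs on the server without access to your shell's other definitions. Since every `_id` is emitted exactly once, MongoDB never has more than one value to reduce per key, so the reduce function is only a formality here.

```javascript
// Map: do all the per-document work here.
// Assumes the numbers live in a field called `values` (hypothetical).
function mapFunction() {
  // Median: middle of the sorted values, or the mean of the two
  // middle values when the count is even. Returns null for an empty array.
  function median(values) {
    if (values.length === 0) { return null; }
    var sorted = values.slice().sort(function (a, b) { return a - b; });
    var mid = Math.floor(sorted.length / 2);
    return sorted.length % 2 === 1
      ? sorted[mid]
      : (sorted[mid - 1] + sorted[mid]) / 2;
  }

  // Mode: the most frequent value. On a tie, the value that reaches
  // the top count first wins. Returns null for an empty array.
  function mode(values) {
    var counts = {};
    var best = null;
    var bestCount = 0;
    values.forEach(function (v) {
      counts[v] = (counts[v] || 0) + 1;
      if (counts[v] > bestCount) {
        best = v;
        bestCount = counts[v];
      }
    });
    return best;
  }

  // `emit` is supplied by MongoDB when the function runs server-side.
  emit(this._id, { median: median(this.values), mode: mode(this.values) });
}

// Reduce: each key has a single emitted value, so pass it through.
function reduceFunction(key, values) {
  return values[0];
}
```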
and for this collection
you can get the median by
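The original snippet is not shown here, so the following is only a sketch of what such a call could look like. It assumes you already have a map and a reduce function of the kind described in this answer; `medianAndModePerDocument` is a hypothetical wrapper name, and `collection` stands for whatever collection handle you use in the shell (for example `db.items`, also a made-up name).

```javascript
// Hypothetical wrapper around the shell's collection.mapReduce call.
// `collection` is a shell collection handle such as db.items (assumed name);
// `mapFunction` and `reduceFunction` are your own map/reduce functions.
function medianAndModePerDocument(collection, mapFunction, reduceFunction) {
  // { out: { inline: 1 } } asks for the results to be returned directly
  // instead of being written to an output collection.
  return collection.mapReduce(mapFunction, reduceFunction, {
    out: { inline: 1 }
  });
}
```

For an overnight job you may prefer to write the output to a collection by passing a collection name as `out`, so the rake task can read the results afterwards.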