I am trying to classify levels of aggregation by finding the most frequently occurring value of a particular field in the documents that are reduced to a given level.
I have documents like this:
{ year: 2012,
  month: 1,
  category: "blue"
},
{ year: 2012,
  month: 1,
  category: "blue"
},
{ year: 2012,
  month: 1,
  category: "blue"
},
{ year: 2012,
  month: 1,
  category: "green"
}
The map function basically emits these documents back out with keys of [year, month] (though I could include the category in the key if needed). I then want the reduce to boil each group down to its most frequently occurring category.
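Roughly, the map looks like the sketch below. This is an illustration of the kind of function described above, not the exact code: the name `mapDoc` and the `emit` stub are only there so the snippet runs outside CouchDB. In a real view the function is anonymous and `emit` is supplied by the view server.

```javascript
// Stand-in for CouchDB's built-in emit(), so the sketch runs on its own.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map step: key each document on [year, month] and pass the document
// through as the value, so the reduce can read doc.category.
function mapDoc(doc) {
  if (doc.year && doc.month && doc.category) {
    emit([doc.year, doc.month], doc);
  }
}

// Hypothetical sample input, mirroring the documents shown earlier.
mapDoc({ year: 2012, month: 1, category: "blue" });
// Documents without the needed fields are skipped.
mapDoc({ year: 2012 });
```

After these two calls, `rows` holds a single entry keyed on [2012, 1].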
For the examples above, querying with group=false, group_level=1, and group_level=2 should all reduce to “blue”.
I thought of changing the key to [year, month, category] in the hope that I could count the category values as I moved up the aggregation, but that doesn’t seem to work.
How would I find the most frequently occurring value for category? I feel like the answer is simple, but I’m just not connecting the dots.
Thanks.
It’s simple, but not concise, as I worked it out.
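One possible shape for it, as a minimal sketch rather than a definitive implementation: have the reduce return the full tally of categories alongside the current winner, so that a rereduce can merge tallies instead of guessing from winners alone. This assumes the map emits the whole document as the value under the key [year, month], and that the number of distinct categories stays small, since CouchDB expects reduce output to remain compact. The name `reduceMode` and the `{ top, counts }` return shape are made up for illustration.

```javascript
// Reduce sketch with the CouchDB signature (keys, values, rereduce).
function reduceMode(keys, values, rereduce) {
  var counts = {};
  var i, cat;

  for (i = 0; i < values.length; i++) {
    if (rereduce) {
      // values[i] is an earlier result of this function: merge its tally.
      for (cat in values[i].counts) {
        counts[cat] = (counts[cat] || 0) + values[i].counts[cat];
      }
    } else {
      // values[i] is a mapped document: count its category once.
      cat = values[i].category;
      counts[cat] = (counts[cat] || 0) + 1;
    }
  }

  // Pick the category with the highest count; on a tie the category
  // that was tallied first is kept.
  var top = null;
  for (cat in counts) {
    if (top === null || counts[cat] > counts[top]) {
      top = cat;
    }
  }
  return { top: top, counts: counts };
}

// Simulate CouchDB reducing two chunks and then rereducing them.
var docs = [
  { year: 2012, month: 1, category: "blue" },
  { year: 2012, month: 1, category: "blue" },
  { year: 2012, month: 1, category: "blue" },
  { year: 2012, month: 1, category: "green" }
];
var partA = reduceMode(null, docs.slice(0, 2), false);
var partB = reduceMode(null, docs.slice(2), false);
var result = reduceMode(null, [partA, partB], true);
console.log(result.top); // blue
```

Returning only the winning category from each chunk would not be enough, because the most frequent value overall cannot be recovered from per-chunk winners. Carrying the counts along is what makes the rereduce step correct, and the client reads `top` from the result at whichever group level it queries.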