I have a weird problem with MongoDB (2.0.2) map reduce.
So, the story goes like this:
I have an Ad model (see the source extract below) and I need to group up to n ads per category, so that I get a nicely ordered listing I can later use to do more interesting things.
    # encoding: utf-8
    class Ad
      include Mongoid::Document
      cache
      include Mongoid::Timestamps

      field :title
      field :slug, :unique => true

      # Groups ad ids by category and writes the result to the "categories" collection.
      def self.aggregate_latest_active_per_category
        map = "function () {
          emit( this.category, { id: this._id });
        }"
        reduce = "function ( key, values ) {
          return { ads: values };
        }"
        self.collection.map_reduce(map, reduce, { :out => "categories" })
      end
    end
All fun and games up until now.
What I expect to get is a result that resembles this (mongo shell output of db.categories.findOne()):
    {
        "_id" : "category_name",
        "value" : {
            "ads" : [
                {
                    "id" : ObjectId("4f2970e9e815f825a30014ab")
                },
                {
                    "id" : ObjectId("4f2970e9e815f825a30014b0")
                },
                {
                    "id" : ObjectId("4f2970e9e815f825a30014b6")
                },
                {
                    "id" : ObjectId("4f2970e9e815f825a30014b8")
                },
                {
                    "id" : ObjectId("4f2970e9e815f825a30014bd")
                },
                {
                    "id" : ObjectId("4f2970e9e815f825a30014c1")
                },
                {
                    "id" : ObjectId("4f2970e9e815f825a30014ca")
                },
                // ... and it goes on and on
            ]
        }
    }
Actually, it would be even better if value could contain only the array, but MongoDB complains that this is not supported yet. With a finalize function applied later, that is not a big problem, and it is not what I want to ask about.
Now, back to the problem. What actually happens when I run the map reduce is that it spits out something like:
    {
        "_id" : "category_name",
        "value" : {
            "ads" : [
                {
                    "ads" : [
                        {
                            "ads" : [
                                {
                                    "ads" : [
                                        {
                                            "ads" : [
                                                {
                                                    "id" : ObjectId("4f2970d8e815f825a3000011")
                                                },
                                                {
                                                    "id" : ObjectId("4f2970d8e815f825a3000017")
                                                },
                                                {
                                                    "id" : ObjectId("4f2970d8e815f825a3000019")
                                                },
                                                {
                                                    "id" : ObjectId("4f2970d8e815f825a3000022")
                                                },
                                                // ... on and on and on
… and while I could probably work out a way to use this it just doesn’t look like something I should get.
So, my questions (finally) are:
- Am I doing something wrong and what is it?
- Is there something wrong with MongoDB map reduce (I mean, besides all the usual things when compared to Hadoop)?
Yes, you’re doing it wrong. The inputs and outputs of map and reduce should be uniform, because they are meant to be executed in parallel, and reduce might be run over partially reduced results. Try these functions:
This should produce documents like:
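The idea can be sketched as follows. This is a minimal illustration rather than a definitive implementation: it assumes every value has the shape { ads: [...] }, so that map emits exactly what reduce returns. The emit stub and the three sample documents at the bottom are hypothetical stand-ins that let the re-reduce case run in plain JavaScript without MongoDB.

```javascript
// Map step: emit a value that already has the final shape, { ads: [...] }.
// Inside MongoDB, `this` is the current document and `emit` is provided by the server.
function map() {
  emit(this.category, { ads: [{ id: this._id }] });
}

// Reduce step: concatenate the `ads` arrays of all incoming values.
// The output has the same shape as each input, so reduce can safely
// be run again over its own partial results.
function reduce(key, values) {
  var result = { ads: [] };
  values.forEach(function (value) {
    value.ads.forEach(function (ad) {
      result.ads.push(ad);
    });
  });
  return result;
}

// --- Local check of the re-reduce case (assumption: no MongoDB involved) ---
// `emit` and the sample documents are stand-ins for illustration only.
var emitted = [];
function emit(key, value) {
  emitted.push(value);
}

[
  { _id: 1, category: "cars" },
  { _id: 2, category: "cars" },
  { _id: 3, category: "cars" }
].forEach(function (doc) {
  map.call(doc);
});

// Reduce the first two values, then reduce that partial result with the third.
var partial = reduce("cars", emitted.slice(0, 2));
var rereduced = reduce("cars", [partial, emitted[2]]);

console.log(JSON.stringify(rereduced));
// prints {"ads":[{"id":1},{"id":2},{"id":3}]}
```

With the reduce from the question (return { ads: values }), the second call would wrap the partial result in another ads array, which is exactly the nesting shown in the question. With the version above, each category should end up as a single flat document of the form { "_id" : "category_name", "value" : { "ads" : [ ... ] } }. The two functions can be passed as strings to collection.map_reduce in the same way as in the question.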