I have several Mongoid models that I’m running map-reduce on, and I’d like to store the unified results in a single daily_stats collection. My map and reduce functions work fine for all 3 models, but even when outputting via collection.mapreduce(map, reduce, {:out => "daily_stats", :raw => true}), the results of each subsequent map-reduce operation overwrite the previous results, even if they don’t have overlapping keys:
{'_id': "2012-06-01", 'values': {photos: 10}}
{'_id': "2012-06-02", 'values': {photos: 10}}
Values for photos get thrown out when a subsequent pass returns:
{'_id': "2012-06-01", 'values': {comments: 1}}
{'_id': "2012-06-02", 'values': {comments: 6}}
I also tried merging with collection.mapreduce(map, reduce, {:out => {:merge => "daily_stats"}, :raw => true}), but that doesn’t seem to work either.
Any ideas?
UPDATE
The map & reduce functions are like this for each model:
Map:
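function() {
  // Truncate created_at to the start of its day (epoch milliseconds),
  // so that all documents created on the same day share one key.
  var d = this.created_at;
  var day = Date.UTC(d.getFullYear(), d.getMonth(), d.getDate());
  emit(day, {users: 1});
}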
Reduce:
function(key, values) {
  // Sum the per-document counts emitted for this day.
  var users_added_count = 0;
  values.forEach(function(v) {
    users_added_count += parseInt(v['users'], 10) || 0;
  });
  return {users: users_added_count};
}
Here’s some extra info about the resulting schema:
{ "_id" : 1337040000000,
"value" : {
"apartments" : 280,
"price" : 1003653,
"photos" : 83,
"comments" : 0 }
}
If you look at the MongoDB documentation for map reduce (http://www.mongodb.org/display/DOCS/MapReduce#MapReduce-Outputoptions), you’ll see that by default, the MR output collection replaces any existing collection with the same name. “Merge” adds new data to the existing output collection, but still overwrites any document that has the same key.
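To make the difference between the two output modes concrete, here is a toy model in plain JavaScript. It is not MongoDB code: writeOutput and the Map standing in for the output collection are made up for illustration, and the sample documents are the ones from the question (written with the "value" field that map reduce output uses).

```javascript
// Toy model of the "replace" and "merge" output modes; NOT MongoDB code.
// `collection` is a plain Map standing in for the output collection.
function writeOutput(collection, results, mode) {
  if (mode === "replace") {
    // Default behaviour: the whole output collection is thrown away first.
    collection.clear();
  }
  results.forEach(function (doc) {
    // In both modes a document with an existing _id is overwritten as a whole;
    // the fields of the old and new `value` are not combined.
    collection.set(doc._id, doc.value);
  });
  return collection;
}

var photos = [
  { _id: "2012-06-01", value: { photos: 10 } },
  { _id: "2012-06-02", value: { photos: 10 } }
];
var comments = [
  { _id: "2012-06-01", value: { comments: 1 } },
  { _id: "2012-06-02", value: { comments: 6 } }
];

var dailyStats = new Map();
writeOutput(dailyStats, photos, "merge");
writeOutput(dailyStats, comments, "merge");

// Only the comments value is left for this day; the photos count is gone.
console.log(dailyStats.get("2012-06-01"));
```

Under this model, merge only protects documents whose _id does not appear in the later run, which matches the behaviour you are seeing.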
It looks like your key is the date? If
{'_id': "2012-06-01", 'values': {photos: 10}}
and
{'_id': "2012-06-01", 'values': {comments: 1}}
have the same key, the second document will replace the first when you run MR. You either need to specify a more unique key, or you need to have multiple output collections (perhaps one for photos and one for comments?).
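As a rough sketch of the "more unique key" option, each model's map function could emit a compound key that carries the stat name as well as the day. The names mapPhotos, mapComments, and the "stat" field are assumptions for illustration, and the emit stub below only exists so the sketch runs outside the database.

```javascript
// Stand-in for MongoDB's emit(), so this sketch can run outside the database.
var emitted = [];
function emit(key, value) {
  emitted.push({ _id: key, value: value });
}

// Hypothetical map function for the photos model.
// The key is {day, stat} instead of the bare day.
function mapPhotos() {
  var d = this.created_at;
  var day = Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());
  emit({ day: day, stat: "photos" }, { photos: 1 });
}

// Hypothetical map function for the comments model: same day, different stat.
function mapComments() {
  var d = this.created_at;
  var day = Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());
  emit({ day: day, stat: "comments" }, { comments: 1 });
}

// One sample document per model, both created on 2012-06-01 (UTC).
var sample = { created_at: new Date(Date.UTC(2012, 5, 1, 14, 30)) };
mapPhotos.call(sample);
mapComments.call(sample);

// The two keys share the day but differ in `stat`, so they no longer collide.
console.log(JSON.stringify(emitted));
```

The trade-off is that daily_stats then holds one document per day and stat rather than one per day, so whatever reads the collection has to combine them.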