I have documents like this:
{
    "_id" : "someid",
    "name" : "somename",
    "action" : "do something",
    "date" : ISODate("2011-08-19T09:00:00Z")
}
I want to map reduce them into something like this:
{
    "_id" : "someid",
    "value" : {
        "count" : 100,
        "name" : "somename",
        "action" : "do something",
        "date" : ISODate("2011-08-19T09:00:00Z"),
        "firstEncounteredDate" : ISODate("2011-07-01T08:00:00Z")
    }
}
I want to group the map-reduced documents by “name”, “action”, and “date”. But every document should have a “firstEncounteredDate” containing the earliest “date”, where the earliest is determined per “name” and “action” only.
If I group by name, action and date, firstEncounteredDate would always equal date. That’s why I’d like to know if there’s any way to get the earliest date (grouped by “name” and “action” across the entire collection) while doing map-reduce.
How can I do this in map reduce?
Edit: more detail on firstEncounteredDate (courtesy of @beny23)
Seems like a two-pass map-reduce would fit the bill, somewhat akin to this example: http://cookbook.mongodb.org/patterns/unique_items_map_reduce/
In pass #1, group the original “name” x “action” x “date” documents by just “name” and “action”, collecting the various “date” values into a “dates” array during reduce. Use a ‘finalize’ function to find the minimum of the collected dates.
Untested code:
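A minimal sketch of pass #1 as described above, not the original snippet. The names `mapPass1`, `reducePass1`, `finalizePass1` and the in-memory `runMapReduce` driver are assumptions made for illustration; the driver only stands in for MongoDB so the functions can be tried in plain JavaScript.

```javascript
// Stand-in for MongoDB's global emit(); the driver below rebinds it per run.
var emit;

// Hypothetical in-memory driver, only for trying the functions outside MongoDB.
function runMapReduce(docs, map, reduce, finalize) {
  var groups = {};
  emit = function (key, value) {
    var k = JSON.stringify(key);
    if (!groups[k]) groups[k] = { key: key, values: [] };
    groups[k].values.push(value);
  };
  docs.forEach(function (doc) { map.call(doc); });
  return Object.keys(groups).map(function (k) {
    var g = groups[k];
    // MongoDB does not call reduce for a key that has a single value.
    var value = g.values.length > 1 ? reduce(g.key, g.values) : g.values[0];
    return { _id: g.key, value: finalize ? finalize(g.key, value) : value };
  });
}

// Pass #1 map: key on name and action only, carry the date in an array.
function mapPass1() {
  emit({ name: this.name, action: this.action },
       { count: 1, dates: [this.date] });
}

// Pass #1 reduce: same shape as the emitted values, so it can be re-reduced.
function reducePass1(key, values) {
  var out = { count: 0, dates: [] };
  values.forEach(function (v) {
    out.count += v.count;
    out.dates = out.dates.concat(v.dates);
  });
  return out;
}

// Pass #1 finalize: pick the earliest of the collected dates.
function finalizePass1(key, value) {
  var min = value.dates[0];
  value.dates.forEach(function (d) { if (d < min) min = d; });
  value.firstEncounteredDate = min;
  return value;
}

// Sample input (made up for this sketch).
var docs = [
  { name: "somename", action: "do something", date: new Date("2011-08-19T09:00:00Z") },
  { name: "somename", action: "do something", date: new Date("2011-07-01T08:00:00Z") },
  { name: "somename", action: "do something", date: new Date("2011-08-19T09:00:00Z") },
  { name: "othername", action: "do something", date: new Date("2011-08-01T00:00:00Z") }
];
var pass1 = runMapReduce(docs, mapPass1, reducePass1, finalizePass1);
```

In the mongo shell the corresponding call would be along the lines of `db.events.mapReduce(mapPass1, reducePass1, { finalize: finalizePass1, out: "pass1" })`, where `events` and `pass1` are placeholder collection names.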
In pass #2, use the documents generated in pass #1 as input. For each “name” x “action” document, emit a new “name” x “action” x “date” document for each collected date, along with the now-determined minimum date common to that “name” x “action” pair. Group by “name” x “action” x “date”, summing up the count for each individual date during reduce.
Equally untested code:
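A minimal sketch of pass #2 under the same caveats, not the original snippet. It assumes the pass #1 output has the shape `{ _id: { name, action }, value: { count, dates, firstEncounteredDate } }`; `mapPass2`, `reducePass2`, and the `runMapReduce` driver are hypothetical names used only for illustration.

```javascript
// Stand-in for MongoDB's global emit(); the driver below rebinds it per run.
var emit;

// Hypothetical in-memory driver, only for trying the functions outside MongoDB.
function runMapReduce(docs, map, reduce, finalize) {
  var groups = {};
  emit = function (key, value) {
    var k = JSON.stringify(key);
    if (!groups[k]) groups[k] = { key: key, values: [] };
    groups[k].values.push(value);
  };
  docs.forEach(function (doc) { map.call(doc); });
  return Object.keys(groups).map(function (k) {
    var g = groups[k];
    // MongoDB does not call reduce for a key that has a single value.
    var value = g.values.length > 1 ? reduce(g.key, g.values) : g.values[0];
    return { _id: g.key, value: finalize ? finalize(g.key, value) : value };
  });
}

// Pass #2 map: fan each name/action document out into one emit per date,
// attaching the minimum date that pass #1 already determined.
function mapPass2() {
  var id = this._id;
  var v = this.value;
  v.dates.forEach(function (d) {
    emit({ name: id.name, action: id.action, date: d },
         { count: 1, name: id.name, action: id.action, date: d,
           firstEncounteredDate: v.firstEncounteredDate });
  });
}

// Pass #2 reduce: sum the counts per name/action/date; the other fields are
// identical across all values of one key, so they are copied from the first.
function reducePass2(key, values) {
  var out = {
    count: 0,
    name: values[0].name,
    action: values[0].action,
    date: values[0].date,
    firstEncounteredDate: values[0].firstEncounteredDate
  };
  values.forEach(function (v) { out.count += v.count; });
  return out;
}

// Sample pass #1 output (made up for this sketch).
var pass1Output = [
  { _id: { name: "somename", action: "do something" },
    value: {
      count: 3,
      dates: [new Date("2011-08-19T09:00:00Z"),
              new Date("2011-07-01T08:00:00Z"),
              new Date("2011-08-19T09:00:00Z")],
      firstEncounteredDate: new Date("2011-07-01T08:00:00Z")
    } }
];
var pass2 = runMapReduce(pass1Output, mapPass2, reducePass2);
```

Note that the resulting `_id` here is the compound name/action/date key rather than a single string as in the target document from the question; the `value` part carries the requested fields.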
Pass #2 does not do a lot of heavy lifting, obviously — it’s mostly copying each document N times, one for each unique date. We could easily build a map of unique dates to their incidence counts during the reduce step of pass #1. (In fact, if we don’t do this, there’s no real point in having a “count” field in the values from pass #1.) But doing the second pass is a fairly effortless way of generating a full target collection containing the desired documents.
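The single-pass variant mentioned above, where the reduce step of pass #1 builds a map of unique dates to counts, might look roughly like the sketch below. Keying the map by millisecond timestamp is an assumption of this sketch (chosen so the field names contain no dots), and `mapCounts`, `reduceCounts`, `finalizeCounts`, and the `runMapReduce` driver are hypothetical names.

```javascript
// Stand-in for MongoDB's global emit(); the driver below rebinds it per run.
var emit;

// Hypothetical in-memory driver, only for trying the functions outside MongoDB.
function runMapReduce(docs, map, reduce, finalize) {
  var groups = {};
  emit = function (key, value) {
    var k = JSON.stringify(key);
    if (!groups[k]) groups[k] = { key: key, values: [] };
    groups[k].values.push(value);
  };
  docs.forEach(function (doc) { map.call(doc); });
  return Object.keys(groups).map(function (k) {
    var g = groups[k];
    // MongoDB does not call reduce for a key that has a single value.
    var value = g.values.length > 1 ? reduce(g.key, g.values) : g.values[0];
    return { _id: g.key, value: finalize ? finalize(g.key, value) : value };
  });
}

// Map: one-entry map from the date's millisecond timestamp to a count of 1.
function mapCounts() {
  var dateCounts = {};
  dateCounts[String(this.date.getTime())] = 1;
  emit({ name: this.name, action: this.action },
       { count: 1, dateCounts: dateCounts });
}

// Reduce: merge the per-date counts and keep the overall total.
function reduceCounts(key, values) {
  var out = { count: 0, dateCounts: {} };
  values.forEach(function (v) {
    out.count += v.count;
    Object.keys(v.dateCounts).forEach(function (t) {
      out.dateCounts[t] = (out.dateCounts[t] || 0) + v.dateCounts[t];
    });
  });
  return out;
}

// Finalize: the earliest date is the smallest timestamp among the map's keys.
function finalizeCounts(key, value) {
  var times = Object.keys(value.dateCounts).map(Number);
  value.firstEncounteredDate = new Date(Math.min.apply(null, times));
  return value;
}

// Sample input (made up for this sketch).
var docs = [
  { name: "somename", action: "do something", date: new Date("2011-08-19T09:00:00Z") },
  { name: "somename", action: "do something", date: new Date("2011-07-01T08:00:00Z") },
  { name: "somename", action: "do something", date: new Date("2011-08-19T09:00:00Z") }
];
var counted = runMapReduce(docs, mapCounts, reduceCounts, finalizeCounts);
```

This yields one document per name/action pair holding all per-date counts, so it does not by itself produce the one-document-per-date collection that the second pass generates.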