This is a code example from this tutorial:
http://kylebanker.com/blog/2009/12/mongodb-map-reduce-basics/
He notes that “as of MongoDB v1.8, you must specify an output collection name.” But I don’t see where this is referred to or why it is needed.
# Running map-reduce from Ruby (irb) assuming
# that @comments references the comments collection
# Specify the map and reduce functions in JavaScript, as strings
>> map = "function() { emit(this.author, {votes: this.votes}); }"
>> reduce = "function(key, values) { " +
"var sum = 0; " +
"values.forEach(function(doc) { " +
" sum += doc.votes; " +
"}); " +
"return {votes: sum}; " +
"};"
# Pass those to the map_reduce helper method
>> @results = @comments.map_reduce(map, reduce, :out => "mr_results")
# Since this method returns an instantiated results collection,
# we just have to query that collection and iterate over the cursor.
>> @results.find().to_a
=> [{"_id" => "hwaet", "value"=>{"votes"=>21.0}},
{"_id" => "kbanker", "value"=>{"votes"=>13.0}}
]
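In the code above, the output collection is named by the :out => "mr_results" argument to map_reduce. The new Map / Reduce output options are covered in the MongoDB Map / Reduce documentation.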
The basic premise is that Map / Reduce originally just wrote its output to a temp collection. That raised an obvious issue (why do all of that work just to have the result be temporary?), so the output now goes to a collection you name, and features were added for merging new results into an existing collection and re-reducing against it.
In particular, you can now run an M/R that effectively updates the output of a previous M/R (think of updating daily stats once an hour while processing only the last hour's data).
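As a rough sketch of that kind of incremental update: assuming the legacy Ruby driver used in the question (whose map_reduce accepts a :query filter and a hash for :out) and a hypothetical created_at timestamp field on each comment, the call might look something like this. The method name and the field name are made up for illustration; map and reduce are the strings from the question.

```ruby
# Sketch only: `created_at` is a hypothetical field, and `comments` is
# assumed to be a collection object from the legacy Ruby driver.
#
# Map-reduce only the comments created since `since`, then fold the
# results into the existing "mr_results" collection instead of replacing it.
def update_vote_totals(comments, map, reduce, since)
  comments.map_reduce(map, reduce,
    :query => { "created_at" => { "$gte" => since } },
    :out   => { :reduce => "mr_results" })
end
```

With the reduce output mode, a key that already exists in mr_results is run through the reduce function again together with the new value, so the reduce function has to return the same shape it receives (here, {votes: n}). The time windows passed as since should not overlap, otherwise the same comments would be counted twice.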
However, if you only want an in-memory version of the results, you can use the inline option.
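For instance, with the legacy Ruby driver from the question, an inline run might be sketched as below. This assumes the driver's :raw option, which returns the raw command response rather than a collection object (in inline mode there is no output collection to instantiate). The method name is made up for illustration; map and reduce are the strings from the question.

```ruby
# Sketch only: `comments` is assumed to be a legacy Ruby driver collection.
#
# Runs the map-reduce without writing an output collection and returns
# the array of {"_id" => ..., "value" => ...} documents directly.
def inline_vote_totals(comments, map, reduce)
  response = comments.map_reduce(map, reduce,
    :out => { :inline => 1 },
    :raw => true)           # ask for the raw command response
  response["results"]       # inline results are under the "results" key
end
```

Inline results come back inside a single response document, so they must fit within the maximum BSON document size; this mode suits small result sets.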