I am having trouble writing a MapReduce job that gives me the stats I need. I have a User object that can create a Post, and a Post can have many Likes from other users.
User
-Post
--Likes
The Post is not embedded in the User because we access posts separately and not just in a user context. The stat I need is the number of likes an author has received, and I need to compute it from the likes on that user's posts. The problem is that, because the posts are not embedded, I cannot access them in my map function. Here are the map and reduce functions I currently have:
def reputation_map
  <<-MAP
    function() {
      var posts = db.posts.find({user_id:this._id});
      emit(this._id, {posts:posts});
    }
  MAP
end
def reputation_reduce
  <<-REDUCE
    function(key, values) {
      var count = 0;
      while(values.hasNext()){
        values.next();
        count+=1;
      }
      return {posts:count};
    }
  REDUCE
end
This should only return the posts for each user, so I have not even reached the likes level yet. But instead of a count, it only returns a DBQuery for posts. What is the correct way of doing this?
Map Reduce is really designed to operate on a single collection at a time.
Technically, it is possible to query a separate collection from inside a map function, as you have done, but be cautious: this is neither recommended nor supported. You may run into issues, especially if the collection is sharded.
A similar question was asked a while back: "How to call to mongodb inside my map/reduce functions? Is it a good practice?"
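If the likes happen to be stored on the post documents themselves, the whole calculation can stay inside the posts collection. Here is a rough sketch, assuming each post looks roughly like { _id, user_id, likes: [...] } (that shape is my guess; it is not shown in the question). Note that reduce receives a plain JavaScript array in values rather than a cursor, so there is no hasNext() to call, and it should return the same shape that map emits because it can be re-run on its own output. The names reputationMap and reputationReduce are made up for illustration, and the emit stub and sample data at the bottom exist only so the functions can be tried outside the database.

```javascript
// Assumed (hypothetical) post shape:
//   { _id: ..., user_id: ..., likes: [ { user_id: ... }, ... ] }

// map runs once per post; `this` is the post document.
function reputationMap() {
  var likeCount = this.likes ? this.likes.length : 0;
  emit(this.user_id, { likes: likeCount });
}

// reduce gets an array of the values emitted for one key. It must return
// the same shape as the emitted values, since it may be fed its own output.
function reputationReduce(key, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i].likes;
  }
  return { likes: total };
}

// In the mongo shell the two functions would be used along these lines:
//   db.posts.mapReduce(reputationMap, reputationReduce, { out: { inline: 1 } });

// --- In-memory stand-in for the database, for illustration only ---
var emitted = {};
function emit(key, value) {
  (emitted[key] = emitted[key] || []).push(value);
}

var samplePosts = [
  { _id: 1, user_id: "alice", likes: [{ user_id: "bob" }, { user_id: "carol" }] },
  { _id: 2, user_id: "alice", likes: [{ user_id: "bob" }] },
  { _id: 3, user_id: "bob" } // no likes yet
];

samplePosts.forEach(function (post) {
  reputationMap.call(post);
});

// The server skips reduce for keys with a single value; calling it here
// anyway gives the same result because of the matching shapes.
var results = {};
Object.keys(emitted).forEach(function (key) {
  results[key] = reputationReduce(key, emitted[key]);
});

console.log(results); // alice has 3 likes in total, bob has 0
```

If the likes live in their own collection instead, the same pair of functions would not apply as written, and one of the approaches below is probably a better fit.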
If you are aggregating results from multiple collections, you may find that the safest and most straightforward way to do it is in the application.
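To give an idea of what that could look like, here is a small sketch of the application-side join. It assumes likes are separate documents carrying a post_id field (an assumption on my part), and the two arrays stand in for the results of one query against posts and one against likes. The function name likesPerAuthor is hypothetical, and the same logic should carry over to whatever language the application is written in.

```javascript
// posts: [{ _id, user_id }], likes: [{ post_id }]  (assumed shapes)
function likesPerAuthor(posts, likes) {
  var authorByPost = {};
  var totals = {};

  // Index posts by id and make sure every author appears, even with 0 likes.
  posts.forEach(function (post) {
    authorByPost[post._id] = post.user_id;
    if (!(post.user_id in totals)) {
      totals[post.user_id] = 0;
    }
  });

  // Credit each like to the author of the post it belongs to.
  likes.forEach(function (like) {
    var author = authorByPost[like.post_id];
    if (author !== undefined) {
      totals[author] += 1; // likes on unknown posts are ignored
    }
  });

  return totals;
}

var fetchedPosts = [
  { _id: "p1", user_id: "u1" },
  { _id: "p2", user_id: "u1" },
  { _id: "p3", user_id: "u2" }
];
var fetchedLikes = [
  { post_id: "p1" },
  { post_id: "p1" },
  { post_id: "p2" },
  { post_id: "p9" } // refers to a post that was not fetched
];

var totalsByAuthor = likesPerAuthor(fetchedPosts, fetchedLikes);
console.log(totalsByAuthor);
```

This costs two queries and one pass over each result set, which is usually fine for moderate data sizes but may not be for very large collections.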
Alternatively, if likes per author is a value that will be queried frequently, it may be preferable to store it as a field in each author's document and spend a little extra overhead on each update to increment it, rather than periodically performing a potentially resource-heavy calculation of all the likes per author.
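A counter like that is typically maintained with the $inc update operator. The sketch below only builds the update arguments and then mimics their effect on a plain object; the field name likes_count and the helpers buildLikeIncrement and applyInc are hypothetical, and the real update would be issued through the shell or your driver.

```javascript
// Builds the query and update documents that bump the author's counter
// whenever one of their posts receives a like.
function buildLikeIncrement(post) {
  return {
    query: { _id: post.user_id },
    update: { $inc: { likes_count: 1 } }
  };
}

// In the mongo shell this would be applied along these lines:
//   var op = buildLikeIncrement(post);
//   db.users.update(op.query, op.update);

// In-memory imitation of $inc, for illustration only: a missing field
// is treated as 0 before the amount is added.
function applyInc(doc, update) {
  Object.keys(update.$inc).forEach(function (field) {
    doc[field] = (doc[field] || 0) + update.$inc[field];
  });
  return doc;
}

var author = { _id: "u1", name: "Example Author" };
var likedPost = { _id: "p1", user_id: "u1" };

var op = buildLikeIncrement(likedPost);
applyInc(author, op.update); // first like
applyInc(author, op.update); // second like

console.log(author.likes_count);
```

Reading the reputation then becomes a simple lookup on the user document. The trade-off is that the counter has to be decremented when a like is removed, or it will drift.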
Hopefully this gives you some food for thought on how to retrieve the values that you need.
If you would like some assistance writing a Map Reduce operation for a single collection, the Community is here to help. Please include a sample input document, and a description of the desired output.
For more information on Map Reduce, the documentation may be found here:
http://www.mongodb.org/display/DOCS/MapReduce
Additionally, there are some good Map Reduce examples in the MongoDB Cookbook:
http://cookbook.mongodb.org/
The “Extras” section of the cookbook article “Finding Max And Min Values with Versioned Documents” http://cookbook.mongodb.org/patterns/finding_max_and_min/ contains a good step-by-step walkthrough of a Map Reduce operation, explaining how the functions are executed.