I have a collection of entities, which represents a tree. Each entity has a property containing an array of attributes.
For example:
{
"_id" : 1,
"parent_id" : null,
"attributes" : [ "A", "B", "C" ]
}
I would like to use MapReduce to generate another collection that mirrors the original, except that each item contains not only the attributes directly associated with the entity but also those of its ancestors, all the way up to the root of the hierarchy.
So given the following entities:
{
"_id" : 1,
"parent_id" : null,
"attributes" : [ "A", "B", "C" ]
}
{
"_id" : 2,
"parent_id" : 1,
"attributes" : [ "D", "E", "F" ]
}
{
"_id" : 3,
"parent_id" : 2,
"attributes" : [ "G", "H", "I" ]
}
The result of the MapReduce job would be the following:
{
"_id" : 1,
"attributes" : [ "A", "B", "C" ]
}
{
"_id" : 2,
"attributes" : [ "A", "B", "C", "D", "E", "F" ]
}
{
"_id" : 3,
"attributes" : [ "A", "B", "C", "D", "E", "F", "G", "H", "I" ]
}
I’ve managed to produce MapReduce jobs that do simple things, like counting the attributes for each entity, but I can’t get my head round how I might deal with a hierarchy. I am open to alternative ways of storing the data, but I don’t want to store the whole hierarchy in a single document.
Is this kind of thing possible with MapReduce in MongoDB, or am I just thinking about the problem in the wrong way?
OK, so I don’t think this will be very performant or scalable, because for every node you have to recursively look up its parent ids all the way to the root. However, it does produce the output you want.
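To make the recursive parent lookup concrete, here is a minimal sketch in plain JavaScript (the language MongoDB map functions are written in). It is not a MongoDB job: the `entities` array, the `byId` index and the `inheritedAttributes` helper are hypothetical stand-ins for the collection and for whatever per-parent lookup the map step would do. It only illustrates the walk from each node up to the root.

```javascript
// Hypothetical in-memory stand-in for the collection from the question.
const entities = [
  { _id: 1, parent_id: null, attributes: ["A", "B", "C"] },
  { _id: 2, parent_id: 1, attributes: ["D", "E", "F"] },
  { _id: 3, parent_id: 2, attributes: ["G", "H", "I"] },
];

// Index by _id so each parent lookup is a single map access.
// (Assumption: against a real collection this would be one query per ancestor.)
const byId = new Map(entities.map((e) => [e._id, e]));

// Walk from the entity up to the root, prepending each ancestor's
// attributes so the root's attributes end up first.
function inheritedAttributes(entity) {
  let attributes = [];
  const seen = new Set();
  for (let node = entity; node; node = byId.get(node.parent_id)) {
    if (seen.has(node._id)) {
      throw new Error("cycle detected at _id " + node._id);
    }
    seen.add(node._id);
    attributes = node.attributes.concat(attributes);
  }
  return attributes;
}

// One output document per entity, in the shape the question asks for.
const flattened = entities.map((e) => ({
  _id: e._id,
  attributes: inheritedAttributes(e),
}));
```

For the three sample entities, `flattened` matches the result listed in the question. The cost is one lookup per ancestor for every node, which is why deep trees make this approach expensive.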
Provides: