I have a collection containing data similar to this:
{
dimension1:a,
dimension2:b,
dimension3:c,
dimension4:d,
dimension5:e,
value: x
}
There is a finite number of values that a, b, c, d, and e can take, so it is possible to see two rows with identical dimensions and different stored values, like this:
{ dimension1:1, dimension2:1, dimension3:1, dimension4:1, dimension5:1, value: 12 }
{ dimension1:1, dimension2:1, dimension3:1, dimension4:1, dimension5:1, value: 34 }
I would like to aggregate objects with matching dimensions and replace them with a single object holding the sum of their values.
I know I can do this with mapReduce, but is there a simpler or faster way, or even a way to ensure that my insert statement adds to an existing value if there is one?
[edit]
I also see that db.collection.group() seems designed for this, but it can't handle the size of my data.
I think you want an Upsert With Modifier. This would satisfy your second approach, so that you insert a row if no matching row exists, or just add the value if a matching row does exist.
So your example would be something like:
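A minimal sketch of such an upsert, assuming the five dimension fields together identify a row and that value is numeric. The helper name upsertArgs and the collection name mycollection are hypothetical, used only for illustration:

```javascript
// Split a document into the three arguments an upsert needs:
// the dimensions become the query, and the value becomes an $inc modifier.
function upsertArgs(doc) {
  const { value, ...dimensions } = doc;
  return [
    dimensions,                  // query: match on all five dimensions
    { $inc: { value: value } },  // modifier: add to the stored value
    { upsert: true }             // insert a new row if nothing matches
  ];
}

const args = upsertArgs({
  dimension1: 1, dimension2: 1, dimension3: 1,
  dimension4: 1, dimension5: 1, value: 12
});

// In the mongo shell (collection name is an assumption):
// db.mycollection.update(args[0], args[1], args[2]);
```

Running the same call again with value: 34 should leave a single row for those dimensions, with the two values added together.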
If you want to insert all the individual values and aggregate them later, I would suggest aggregating them into a separate collection (to avoid confusion). Probably the easiest way to do this is with a map/reduce rather than a group, as you can simply set the output options of the map/reduce to merge its output into the aggregate collection, with options like this:
out : {reduce: "aggregatedcollection"}.
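A rough sketch of how the map and reduce functions could look for the documents in the question. The names mapFn, reduceFn and outOption, and the source collection name mycollection, are assumptions made for illustration; only the out option comes from the text above:

```javascript
// Map: key each document by its five dimensions and emit its value.
// emit() and `this` are provided by MongoDB when the job runs.
function mapFn() {
  emit(
    {
      dimension1: this.dimension1, dimension2: this.dimension2,
      dimension3: this.dimension3, dimension4: this.dimension4,
      dimension5: this.dimension5
    },
    this.value
  );
}

// Reduce: sum every value emitted for one combination of dimensions.
function reduceFn(key, values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

// Output option: reduce new results into what the target collection
// already holds for the same key.
const outOption = { reduce: "aggregatedcollection" };

// In the mongo shell (source collection name is an assumption):
// db.mycollection.mapReduce(mapFn, reduceFn, { out: outOption });
```

Because the reduce function is a plain sum, reducing fresh results into existing totals gives the same answer as summing everything at once, which is what makes the reduce output mode a fit here.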