I have a collection of Parents that contain EmbeddedThings, and each EmbeddedThing contains a reference to the User that created it.
UserCollection: [
  {
    _id: ObjectId(…),
    name: '…'
  },
  …
]
ParentCollection: [
  {
    _id: ObjectId(…),
    EmbeddedThings: [
      {
        _id: 1,
        userId: ObjectId(…)
      },
      {
        _id: 2,
        userId: ObjectId(…)
      }
    ]
  },
  …
]
I soon realized that I need to get all EmbeddedThings for a given user, which I managed to accomplish using map/reduce:
"results": [
  {
    "_id": 1,
    "value": [ `EmbeddedThing`, `EmbeddedThing`, … ]
  },
  {
    "_id": 2,
    "value": [ `EmbeddedThing`, `EmbeddedThing`, … ]
  },
  …
]
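The map/reduce job itself is not shown here. As a rough illustration of the grouping it produces, here is a minimal in-memory sketch in plain JavaScript. It assumes the result key is the userId; the sample data, the string ids standing in for ObjectIds, and the function names mapParent and groupByUser are all hypothetical.

```javascript
// Hypothetical sample data shaped like ParentCollection;
// plain strings stand in for ObjectIds.
const parents = [
  { _id: "p1", EmbeddedThings: [{ _id: 1, userId: "u1" }, { _id: 2, userId: "u2" }] },
  { _id: "p2", EmbeddedThings: [{ _id: 3, userId: "u1" }] },
];

// Map step: emit one [userId, thing] pair per embedded thing.
function mapParent(parent) {
  return parent.EmbeddedThings.map((thing) => [thing.userId, thing]);
}

// Reduce step: collect every thing emitted under the same userId.
function groupByUser(allParents) {
  const byUser = new Map();
  for (const parent of allParents) {
    for (const [userId, thing] of mapParent(parent)) {
      if (!byUser.has(userId)) byUser.set(userId, []);
      byUser.get(userId).push(thing);
    }
  }
  // Same shape as the "results" array above: one entry per user.
  return [...byUser].map(([userId, things]) => ({ _id: userId, value: things }));
}

const results = groupByUser(parents);
console.log(JSON.stringify(results, null, 2));
```

Every Parent has to be visited to build this list, which is why the per-user read gets more expensive as the Parent collection grows.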
Is this where I should really just normalize EmbeddedThing into its own collection, or should I still keep map/reduce to accomplish this? Some other design perhaps?
If it helps, this is for users to see their list of EmbeddedThings across all Parents, as opposed to for some reporting/aggregation task (which made me realize I might be doing this wrong).
Thanks!
“To embed or not to embed: that is the question” 🙂
My rules are:
OrderItem without an Order doesn’t make sense.
You should look at your access patterns. If you load ParentThing several thousand times per second, and load User once a week, then map-reduce is probably a good choice. User query will be slow, but it might be ok for your application.
Yet another approach is to denormalize even more. That is, when you add an embedded thing, add it to both parent thing and user.
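To make the last option concrete, here is a minimal in-memory sketch in plain JavaScript of writing each embedded thing to both places. It is only an illustration: the Map-based stand-ins for the two collections, the extra parentId field on the user's copy, and the function names addEmbeddedThing and thingsForUser are assumptions, not part of the original schema.

```javascript
// Hypothetical in-memory stand-ins for the two collections;
// plain strings stand in for ObjectIds.
const users = new Map([["u1", { _id: "u1", name: "example", EmbeddedThings: [] }]]);
const parentsById = new Map([["p1", { _id: "p1", EmbeddedThings: [] }]]);

// Write path: store the thing on both the parent and the user.
function addEmbeddedThing(parentId, thing) {
  const parent = parentsById.get(parentId);
  const user = users.get(thing.userId);
  if (!parent || !user) throw new Error("unknown parent or user");
  parent.EmbeddedThings.push(thing);
  // The user's copy carries parentId (an assumed extra field)
  // so it can be traced back to its parent.
  user.EmbeddedThings.push({ ...thing, parentId });
}

// Read path: a user's things across all parents, with no scan over parents.
function thingsForUser(userId) {
  const user = users.get(userId);
  return user ? user.EmbeddedThings : [];
}

addEmbeddedThing("p1", { _id: 1, userId: "u1" });
console.log(thingsForUser("u1"));
```

The trade-off is on the write side: every insert, update, or delete of an embedded thing now touches two documents, and the two copies can drift apart if one of the writes fails.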