Looking for a little guidance in setting up a MongoDB schema. Here’s the scenario:
I’m creating a save bookmark feature for people. In the DB, all I need to store is the username, a title, and a link. From this, I would need to create a service that outputs JSON and queries a particular bookmark or a person’s entire feed. Which of the two set ups makes more sense from both an implementation and performance stand point?
A) Each bookmark is its own object:
{
"_id": ObjectId("abcd1234"),
"username": "Choy",
"title": "This is my first link",
"url": "http://www.google.com"
},
{
"_id": ObjectId("abcd1234"),
"username": "Choy",
"title": "This is my second link",
"url": "http://www.bing.com"
}
B) Each user is its own object:
{
"_id": "Choy",
"bookmarks": {
"abcd1234": {
"title": "This is my first link",
"url": "http://www.google.com"
},
"abcd12345": {
"title": "This is my second link",
"url": "http://www.bing.com"
}
}
}
Initially (A) made more sense to me, as I could easily query a specific bookmark, update, and remove it. But from the application point of view, (B) would be easier when I want to list all the bookmarks for a person as I could just do a findOne(username) on the _id instead of having to iterate through each record after doing a find(username) and convert to an array and then JSON (which I believe is a bit memory intensive).
On the other hand, it would be an extra step in (B) to add a new bookmark, as I would have to get the record, push a new bookmark into it and then save.
When you have a has-a relation in MongoDB, it is usually the best decision to embed the data in the object which owns it.
Your goal is to fulfill the needs of the user with as few searches as possible, because every single document lookup costs time. When you don’t need all the bookmarks from a user but only specific ones, you can always use the dot notation to reach into objects and retrieve subsets of fields.
Aggregation instead of relation is also useful when you delete or rename a user. MongoDB can’t do auto-cascade like SQL databases, so you have to deal with any orphaned data yourself in that case. But when the user document is self-contained, this won’t be a problem.
So I would recommend you to go for solution B).