I can't find the "best" solution for a very simple problem (or maybe not so simple).
I have a classic set of data: posts that are attached to users, and comments that are attached to both a post and a user.
Now I can't decide how to build the schema/classes.
One way is to store user_id inside comments and inside posts.
But what happens when I have 200 comments on a page?
Or when I have N posts on a page?
I mean, that would be 200 additional requests to the database just to display user info (such as name and avatar).
Another solution is to embed user data into each comment and each post.
But first, it is a huge overhead; second, the model system gets corrupted (I'm using mongoalchemy); third, a user can change his info (like the avatar). And what then? As I understand it, an update operation on huge collections of comments or posts is not a simple operation…
What would you suggest? Are 200 requests per page to MongoDB OK (I must aim for performance)?
Or maybe I am just missing something…
You can avoid the N+1 problem of hundreds of requests by using $in queries. Consider this:
Now you can find a post's comments with an $in query, and you can also easily find all comments made by a specific author.
Of course, you could also store the comments as an embedded array in the post, and perform an $in query on the user information when you fetch the comments. That way, you don't need to denormalize user names and still don't need hundreds of queries.
If you choose to denormalize the user names, you will have to update all comments ever made by that user when the user changes, e.g., his name. On the other hand, if such operations don't occur very often, it shouldn't be a big deal. Or maybe it's even better to store the name the user had when he made the comment, depending on your requirements.
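To make the $in idea from this answer concrete, here is a minimal Python sketch of fetching a post's comments and all of their authors in two queries instead of one query per comment. The collection names (comments, users), the field names (post_id, author_id, name, avatar) and the helper functions are assumptions made for illustration only; they are not taken from the question or from mongoalchemy, and `db` is assumed to be a pymongo Database.

```python
def author_query(comments):
    """Build one $in filter covering every distinct author of the given comments."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    author_ids = list(dict.fromkeys(c["author_id"] for c in comments))
    return {"_id": {"$in": author_ids}}


def attach_authors(comments, users):
    """Join comments to their authors in application code, without extra queries."""
    users_by_id = {u["_id"]: u for u in users}
    # Each result is a copy of the comment with an added "author" key
    # (None if the user document was not found).
    return [dict(c, author=users_by_id.get(c["author_id"])) for c in comments]


def comments_with_authors(db, post_id):
    """Two queries in total, no matter how many comments the post has.

    Assumption: `db` is a pymongo Database with hypothetical `comments`
    and `users` collections.
    """
    comments = list(db.comments.find({"post_id": post_id}))
    # Second argument is a projection: fetch only what the page displays.
    users = list(db.users.find(author_query(comments), {"name": 1, "avatar": 1}))
    return attach_authors(comments, users)
```

With 200 comments written by, say, 40 distinct users, this issues one query for the comments and one for the 40 user documents. The same pattern should work for a page of N posts: collect the author ids of all posts and comments first, then run a single $in query on users.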
A general problem with embedding is that different writers will write to the same object, so you will have to use the atomic modifiers (such as $push). This is sometimes harder to use with mappers (I don't know mongoalchemy, though), and is generally less flexible.
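As a rough illustration of the atomic modifier mentioned above, the sketch below appends an embedded comment to a post with $push, so that two users commenting at the same time do not overwrite each other's changes. The posts collection, the comments array field and the helper names are hypothetical, and `db` is again assumed to be a pymongo Database; how (or whether) this maps onto mongoalchemy is not covered here.

```python
from datetime import datetime, timezone


def push_comment_update(author_id, text):
    """Build a $push update document that appends one embedded comment."""
    comment = {
        "author_id": author_id,
        "text": text,
        "created_at": datetime.now(timezone.utc),
    }
    return {"$push": {"comments": comment}}


def add_comment(db, post_id, author_id, text):
    """Append on the server instead of read-modify-write on the client.

    Assumption: `db` is a pymongo Database with a hypothetical `posts`
    collection whose documents hold an embedded `comments` array.
    """
    return db.posts.update_one(
        {"_id": post_id},
        push_comment_update(author_id, text),
    )
```

The alternative of loading the whole post, appending to the list in Python and saving the document back is what tends to lose writes when several people comment concurrently, which is why the modifier is needed in the embedded design.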