I am playing around with MongoDB and I find myself constantly thinking in RDBMS terms, so I need your help to get my head out of this.
So I have a document which I would like to tag. As every documentation/example mentions, I will embed the tags in the document. However, my next thought would be where to save the slug (derived from that tag).
In each document, instead of saving something like
["tag1", "this is tag 2"]
should I save it like this:
[{ "slug": "tag1", "tag": "tag1" }, { "slug": "this-is-tag-2", "tag": "this is tag 2" }]
Or should I have another collection containing a unique tag-to-slug mapping? (And thus have to query that first before getting all the documents with the slug “this-is-tag-2”?)
Isn’t saving the slug in the document a waste of space (considering that the relation is always the same) and maybe a performance overhead when querying the collection?
How would you go about it?
It depends on what you are trying to accomplish. In an RDBMS, you search for the one true data structure in the nth normal form. There is no such ‘correct’ data structure for a specific set of data in a document store. To find a good data structure, you must ask yourself: what do my queries look like? Will I read much more often than I insert?
For instance, a problematic query with embedding would be: “show the tags that were used, sorted by popularity”, or even worse “show the tags used by my friends sorted by popularity”. To perform the former query quickly, you must keep track of the available tags and the number of references somewhere else. For the latter, you should use an RDBMS.
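As a rough illustration of "keeping track of the available tags and the number of references somewhere else", here is a minimal in-memory sketch. The TagIndex class and its method names are made up for this example and are not part of any MongoDB driver; with a real database, the counter update would typically be an upsert using $inc on a separate tags collection, performed alongside each document insert.

```python
from collections import Counter


class TagIndex:
    """Hypothetical stand-in for a documents collection plus a tag-count collection."""

    def __init__(self):
        self.documents = []          # documents with embedded tags
        self.tag_counts = Counter()  # tag -> number of documents using it

    def insert(self, doc):
        """Store the document and bump the counter of each of its tags."""
        self.documents.append(doc)
        # De-duplicate so a tag repeated inside one document counts once.
        self.tag_counts.update(set(doc.get("tags", [])))

    def popular(self, limit=10):
        """Return (tag, count) pairs, most used first."""
        return self.tag_counts.most_common(limit)


index = TagIndex()
index.insert({"title": "Intro", "tags": ["mongodb", "nosql"]})
index.insert({"title": "Schema design", "tags": ["mongodb", "nosql", "modeling"]})
index.insert({"title": "Indexes", "tags": ["mongodb"]})
print(index.popular(2))
```

The trade-off is the usual one for denormalized data: reads of the popularity list become cheap, but every insert (and every tag removal, which this sketch does not handle) has to keep the counters in sync.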
Tagging is one of the rare cases where I’d usually go for embedding, because typically tags don’t change too often and there’s no need to have some kind of referential integrity (i.e. you can’t change ‘a tag’). But it depends on what you’re tagging and who’s doing it.
I also don’t understand what the slug is good for: if you need something you can search for, you could simply remove special characters and whitespace and make this a lowercase string on insert?
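A minimal sketch of that idea, assuming the slug is derived purely from the tag text at insert time. The function names slugify and tag_document are hypothetical, and joining words with hyphens (rather than dropping whitespace entirely) is a choice made here to match the "this-is-tag-2" form in the question.

```python
import re
import unicodedata


def slugify(tag):
    """Turn a free-form tag into a lowercase, hyphen-separated slug."""
    # Split accented characters into base letter + accent, then drop non-ASCII.
    normalized = unicodedata.normalize("NFKD", tag)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    # Lowercase, then collapse each run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")


def tag_document(doc, tags):
    """Embed each tag together with its derived slug before the document is saved."""
    doc["tags"] = [{"slug": slugify(t), "tag": t} for t in tags]
    return doc


doc = tag_document({"title": "Example"}, ["tag1", "this is tag 2"])
print(doc["tags"])
```

Because the slug is a pure function of the tag, it can be recomputed from an incoming URL and matched against the embedded field directly, so no separate lookup collection is needed for the simple case.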