From the post “What is the most efficient way to store tags in a database?”, it was recommended to store tag tables like this:
Table: Item
Columns: ItemID, Title, Content

Table: Tag
Columns: TagID, Title

Table: ItemTag
Columns: ItemID, TagID
Another SO post said the same thing. Can anyone explain why tags should be stored like this? I am guessing ItemID is some internal value, Title is the tag name (c++, sql, noob, etc.), and Content is whatever other data I want to store with my item. Why not something like this:
Table: Item
Columns: ItemID, Title, <more data I want>

Table: TagList
Columns: ItemID, Title
Here, Title in Item is the item name and Title in TagList is the tag (‘c++’, ‘sql’, ‘noob’, etc.).
There’s nothing wrong with the second design you show, the one with the TagList table, except that it takes more space. That is, if you tag 10,000 items with the tag ‘database-design’, then in the two-table design you have to store that string 10,000 times. If space efficiency matters more, you could use the three-table design, which would store only the 4-byte integer ID for ‘database-design’ 10,000 times. That saves roughly 11 bytes per row (a 15-byte string versus a 4-byte ID), or about 110,000 bytes in total.
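The estimate above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it counts the bytes used to identify the tag on each tagged row, assumes a 4-byte integer TagID and a single-byte encoding for the tag string, and ignores row, index, and length-prefix overhead, which vary by database.

```python
# Rough comparison of tag storage in the two designs.
# Assumptions (not from the original post): 4-byte integer IDs,
# UTF-8 text, no per-row or index overhead counted.

TAG = "database-design"
TAGGED_ITEMS = 10_000
INT_ID_BYTES = 4

tag_bytes = len(TAG.encode("utf-8"))  # bytes needed to store the string once

# Two-table design: the string is repeated on every TagList row.
two_table_bytes = tag_bytes * TAGGED_ITEMS

# Three-table design: one integer per ItemTag row, plus a single Tag row
# that holds the ID and the string once.
three_table_bytes = INT_ID_BYTES * TAGGED_ITEMS + (INT_ID_BYTES + tag_bytes)

savings = two_table_bytes - three_table_bytes
print(f"two-table:   {two_table_bytes} bytes")
print(f"three-table: {three_table_bytes} bytes")
print(f"savings:     {savings} bytes")
```

The ItemID column is left out because both designs store it once per tagged row, so it cancels out of the comparison.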
Another difference is that in the three-table design, you could have more than one row in the Tag table with the same string, even though they have different integer ID values. So in the ItemTag table, they would appear to be different tags, and you’d never know that they’re actually tagged similarly. Whereas in the two-table design, tags with the same spelling become grouped together implicitly.
Another point: if you have the need to change the spelling of tags, then in the two-table design you have to update many rows. In the three-table design, you only need to update a single row.
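Both points can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table and column names come from the question, while the column types, primary keys, and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Three-table design (Item itself is omitted; it is not needed here)
    CREATE TABLE Tag (TagID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
    CREATE TABLE ItemTag (ItemID INTEGER NOT NULL, TagID INTEGER NOT NULL,
                          PRIMARY KEY (ItemID, TagID));
    -- Two-table design
    CREATE TABLE TagList (ItemID INTEGER NOT NULL, Title TEXT NOT NULL,
                          PRIMARY KEY (ItemID, Title));
""")

# With no uniqueness rule on Tag.Title, the same spelling can be
# inserted twice and receives two different IDs.
conn.executemany("INSERT INTO Tag (Title) VALUES (?)", [("sql",), ("sql",)])
duplicate_ids = [row[0] for row in conn.execute(
    "SELECT TagID FROM Tag WHERE Title = 'sql' ORDER BY TagID")]

# Tag three items with 'sql' in both designs.
items = [(1,), (2,), (3,)]
conn.executemany("INSERT INTO ItemTag (ItemID, TagID) VALUES (?, 1)", items)
conn.executemany("INSERT INTO TagList (ItemID, Title) VALUES (?, 'sql')", items)

# Changing the spelling: one row in the three-table design,
# one row per tagged item in the two-table design.
three_table_update = conn.execute(
    "UPDATE Tag SET Title = 'SQL' WHERE TagID = 1").rowcount
two_table_update = conn.execute(
    "UPDATE TagList SET Title = 'SQL' WHERE Title = 'sql'").rowcount

print(duplicate_ids, three_table_update, two_table_update)
```

Declaring Title as UNIQUE in the Tag table would be one way to rule out the duplicate rows. That constraint is not part of the schema shown in the question.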
And finally, if you commonly need a list of unique tags, it’s faster to query the Tag table in the three-table design, instead of needing a SELECT DISTINCT Title FROM TagList every time you need the unique list. And the latter only gives you a list of tags in use, not a list of all eligible tags.
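The difference between "all eligible tags" and "tags in use" is easy to show with a tag that no item carries yet. This is a minimal sketch using Python's sqlite3 module. The schema follows the question, and the column types and sample data are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tag (TagID INTEGER PRIMARY KEY, Title TEXT NOT NULL);
    CREATE TABLE ItemTag (ItemID INTEGER NOT NULL, TagID INTEGER NOT NULL);
    CREATE TABLE TagList (ItemID INTEGER NOT NULL, Title TEXT NOT NULL);
""")

# Three eligible tags; 'noob' is not attached to any item.
conn.executemany("INSERT INTO Tag (TagID, Title) VALUES (?, ?)",
                 [(1, "c++"), (2, "sql"), (3, "noob")])
conn.executemany("INSERT INTO ItemTag (ItemID, TagID) VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2)])

# The same taggings expressed in the two-table design.
conn.executemany("INSERT INTO TagList (ItemID, Title) VALUES (?, ?)",
                 [(1, "c++"), (1, "sql"), (2, "sql")])

# Three-table design: the Tag table already is the unique list.
all_tags = [row[0] for row in conn.execute(
    "SELECT Title FROM Tag ORDER BY Title")]

# Two-table design: duplicates must be collapsed on every query,
# and tags that are not in use never show up.
used_tags = [row[0] for row in conn.execute(
    "SELECT DISTINCT Title FROM TagList ORDER BY Title")]

print(all_tags)
print(used_tags)
```

In the two-table design there is nowhere to record ‘noob’ until some item is tagged with it, which is why it is missing from the second list.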