I am building a tagging system (think post tagging in a blog) that will use two tables in MySQL.
First table will have:
- tag_id (int)
- tag (varchar)
Second table will have:
- tag_id
- post_id (to link them)
When adding a tag, the first thing I want to do is check if the tag already exists in the first table.
What is the most efficient way to do that? Should I just run
SELECT tag_id FROM tags WHERE tag = 'atag';
If so, what is the best way to index the tag field?
Would it be more efficient to add a third column holding a hash of the tag, then index and search that instead?
I expect the number of tags to grow into the hundreds of thousands.
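I’d say a normal unique index on tags.tag is the way to go: it turns the lookup by tag into an index search instead of a table scan, and it also stops the same tag from being inserted twice.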
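To make the unique-index idea concrete, here is a minimal sketch in Python. It is only an illustration under a few assumptions: it uses the standard library's sqlite3 module with an in-memory database as a stand-in for MySQL so that it runs without a server, the index name idx_tags_tag and the helper get_or_create_tag_id are made up for this example, and INSERT OR IGNORE is SQLite syntax rather than MySQL syntax.

```python
import sqlite3


def get_or_create_tag_id(conn, tag):
    """Return the tag_id for `tag`, inserting the tag first if it is new."""
    # The unique index on tags.tag turns a duplicate insert into a no-op here.
    # (INSERT OR IGNORE is SQLite syntax, used only because this sketch runs on SQLite.)
    conn.execute("INSERT OR IGNORE INTO tags (tag) VALUES (?)", (tag,))
    # This lookup is served by the unique index rather than a full table scan.
    row = conn.execute("SELECT tag_id FROM tags WHERE tag = ?", (tag,)).fetchone()
    return row[0]


# In-memory SQLite database standing in for the MySQL `tags` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, tag VARCHAR(64) NOT NULL)")
conn.execute("CREATE UNIQUE INDEX idx_tags_tag ON tags (tag)")

first_id = get_or_create_tag_id(conn, "atag")
second_id = get_or_create_tag_id(conn, "atag")  # same tag, so the same id comes back
```

In MySQL itself the same effect comes from declaring tags.tag with a UNIQUE index; the point of the sketch is only that the existence check and the insert both lean on that one index, so no separate hash column is needed.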
Additionally, since a few tags will account for a big part of the tag cloud, you might want to consider LRU-caching them in RAM.
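The LRU idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class name TagIdCache, the capacity default, and the lookup callable, which stands for whatever function actually queries the tags table and returns a tag_id or None.

```python
from collections import OrderedDict


class TagIdCache:
    """Small LRU cache mapping tag text to tag_id (illustrative sketch)."""

    def __init__(self, lookup, capacity=1000):
        self._lookup = lookup        # hypothetical function: tag -> tag_id or None
        self._capacity = capacity    # maximum number of tags kept in RAM
        self._entries = OrderedDict()

    def get(self, tag):
        if tag in self._entries:
            # Cache hit: mark the tag as most recently used.
            self._entries.move_to_end(tag)
            return self._entries[tag]
        # Cache miss: fall back to the database lookup.
        tag_id = self._lookup(tag)
        if tag_id is not None:
            self._entries[tag] = tag_id
            if len(self._entries) > self._capacity:
                # Evict the least recently used tag.
                self._entries.popitem(last=False)
        return tag_id
```

Popular tags stay near the "recent" end of the ordered dictionary and are rarely evicted, so most lookups for them never reach the database. If caching misses is acceptable too, the standard library's functools.lru_cache decorator on the lookup function is a shorter alternative.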