Although some major systems such as Joomla store tags as comma-separated text in the main article database, a normalized design with three tables (article, tags and tag-relationship) is preferred, as used by others such as WordPress. There are plenty of discussions and questions about the structure and about reading from it, but I was unable to find the best way to INSERT, since we need to insert into three tables. How can this process be run quickly in one SQL run? Or do we need to insert the article first, then each tag, and finally write the relationships?
Another question is about the uniqueness of the tags. The main advantage of this design is that each term needs to be stored only once and is then connected to the corresponding articles. Is it practical to use a MySQL UNIQUE index to avoid duplication? Or (as I read somewhere) do we need to read the entire list of tags into an array, look for duplicates, and pick up the existing tag ID to avoid storing the term again?
Would the whole process be three individual steps:
- INSERT the article
- INSERT the tags with UNIQUE, regardless of their relationships
- Find each tag ID and create a relationship to the article ID
Am I right? I ask because I have seen people fetch the tags as an array and compare them in code. To me that looks very slow and would hurt performance, particularly for UPDATE.
You can only ever insert into one table at a time.
One solution is to use triggers; the other is to use a transaction.
Triggers can be used with any engine, whereas transactions require InnoDB or a similar transactional engine.
Make sure you put a UNIQUE index on the field tag.name.
1-Using transactions
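As a rough sketch of the transaction approach, the three steps from the question can be wrapped in one transaction. This is not the original answer's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the table layout, the column names, the `insert_article` helper and the lower-casing of tag names are all assumptions made for illustration. On MySQL the second statement would be spelled INSERT IGNORE rather than SQLite's INSERT OR IGNORE.

```python
import sqlite3

# Hypothetical three-table layout; names and columns are assumptions.
SCHEMA = """
CREATE TABLE article (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    body  TEXT NOT NULL
);
CREATE TABLE tag (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE          -- the UNIQUE index on tag.name
);
CREATE TABLE article_tag (
    article_id INTEGER NOT NULL REFERENCES article(id),
    tag_id     INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (article_id, tag_id)
);
"""


def insert_article(conn, title, body, tags):
    """Store an article and its tags atomically; return the new article id."""
    # Assumption: tags are trimmed and lower-cased before storage.
    names = sorted({t.strip().lower() for t in tags if t.strip()})
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute(
            "INSERT INTO article (title, body) VALUES (?, ?)", (title, body)
        )
        article_id = cur.lastrowid
        # The UNIQUE index makes the database skip names that already exist.
        conn.executemany(
            "INSERT OR IGNORE INTO tag (name) VALUES (?)",
            [(n,) for n in names],
        )
        # Link by name, so tag ids are never pulled into the application.
        conn.executemany(
            "INSERT INTO article_tag (article_id, tag_id) "
            "SELECT ?, id FROM tag WHERE name = ?",
            [(article_id, n) for n in names],
        )
    return article_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = insert_article(conn, "Hello", "First post", ["mysql", "tags"])
second = insert_article(conn, "Again", "Second post", ["MySQL", "performance"])
```

The point of the sketch is that the list of existing tags is never read into an array: the duplicate check is left to the UNIQUE index, and the relationship rows are filled by an INSERT ... SELECT on the tag name.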
2-Using a trigger on a blackhole table
Create a table of type blackhole with the following fields.
Add an AFTER INSERT trigger to the blackhole table to do the actual storage for you.
Now you can just insert the article with tags in a single statement:
Back to your question
Tags are only useful if there is a limited number of them. If you put a (unique) index on tag.name, looking for a tag will be very fast, even with 10,000 tags. This is because you are looking for an exact match. And if you are really in a hurry, you can always make the tag table a memory table with a hash index on the name field.
I doubt you need to worry about slowness in the tag lookup though.
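To illustrate the exact-match claim, here is a small sketch that fills a tag table with 10,000 rows and asks the query planner how it would find one tag by name. It again uses Python's sqlite3 module as a stand-in, so the plan text is SQLite's and not what MySQL's EXPLAIN would print; the table layout and the generated tag names are assumptions. The memory table with a hash index is MySQL-specific and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tag table; the UNIQUE constraint creates the index on name.
conn.execute(
    "CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.executemany(
    "INSERT INTO tag (name) VALUES (?)",
    [(f"tag{i}",) for i in range(10_000)],
)

# Ask the planner how it resolves an exact-match lookup by name.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM tag WHERE name = ?", ("tag9999",)
).fetchall()
# The human-readable detail is the last column of each plan row.
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)

# The lookup itself: one index search instead of a scan over all tags.
tag_id = conn.execute(
    "SELECT id FROM tag WHERE name = ?", ("tag9999",)
).fetchone()[0]
```

If the plan reports an index search rather than a table scan, the lookup cost grows only slowly with the number of tags, which is why comparing against an in-memory array of all tags should not be necessary.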
Just make sure you don’t allow too many tags per article. 5 seems a good start. 10 would be too many.
Links
http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
http://dev.mysql.com/doc/refman/5.0/en/blackhole-storage-engine.html