I have a Questions model, and just like StackOverflow, each question can be tagged with multiple descriptive tags by a user.
What I’m trying to decide is whether it’s necessary for the Tags associated with a question to be stored in a separate table in the database.
Or could I store the Tags as a single field of the Questions table as a list of space-separated strings?
I’m not sure which makes more sense – is there any good reason to separate the data?
Using a delimited string (comma- or space-separated) for a multi-valued attribute is a well-known SQL antipattern. 🙂 It raises questions like these:
How long does the string need to be? Stated another way: how many tags can a given entry have? (It depends on how long the individual tags are.)
How do you account for strings that contain the separator character? What if a character you currently use as a separator becomes a legitimate character in a tag?
How do you insert or delete elements from the list in SQL? (You have to fetch the whole list into the application, explode the list, filter through it, and re-post it to the database.)
How can you do aggregates like COUNT(*) in SQL?
How do you search efficiently for all entries that share a given tag? (You have to use costly pattern-matching queries.)
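To make the search problem concrete, here is a rough sketch using Python's built-in sqlite3 module. The table layout, column names, and sample rows are made up for illustration; they are not from the original question. It shows a naive pattern match returning a false positive, and the padded workaround that fixes the match but still relies on a leading wildcard, which an ordinary index generally cannot help with.

```python
import sqlite3

# Hypothetical schema: tags kept as one space-separated string per question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT, tags TEXT)"
)
conn.executemany(
    "INSERT INTO questions (id, title, tags) VALUES (?, ?, ?)",
    [
        (1, "How do closures work?", "javascript closures"),
        (2, "What is a JVM?", "java jvm"),
    ],
)

# Naive substring match: looking for "java" also matches "javascript".
naive = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM questions WHERE tags LIKE ? ORDER BY id",
        ("%java%",),
    )
]

# Workaround: pad both sides with the separator so only whole tags match.
# The pattern still starts with a wildcard, so every row gets examined.
padded = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM questions WHERE ' ' || tags || ' ' LIKE ? ORDER BY id",
        ("% java %",),
    )
]

print(naive)   # [1, 2]  (question 1 is a false positive)
print(padded)  # [2]
```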
The solution is to use a separate table, as most other folks on this thread are advising.
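As a minimal sketch of what that could look like, assuming a many-to-many layout with a junction table (the names questions, tags, and question_tags are my own choice, not anything from the question), here it is with Python's built-in sqlite3 module. Adding a tag, removing a tag, counting, and searching each become a single plain SQL statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one row per (question, tag) pair in question_tags.
conn.executescript("""
CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE question_tags (
    question_id INTEGER NOT NULL REFERENCES questions(id),
    tag_id      INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (question_id, tag_id)
);
CREATE INDEX idx_question_tags_tag ON question_tags(tag_id);
""")

# Made-up sample data.
conn.executemany("INSERT INTO questions VALUES (?, ?)",
                 [(1, "Question A"), (2, "Question B")])
conn.executemany("INSERT INTO tags VALUES (?, ?)",
                 [(1, "sql"), (2, "database-design"), (3, "python")])
conn.executemany("INSERT INTO question_tags VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1), (2, 3)])

# Add one tag to a question: a single INSERT, no string rebuilding.
conn.execute("INSERT INTO question_tags VALUES (?, ?)", (2, 2))

# Remove one tag from a question: a single DELETE.
conn.execute(
    "DELETE FROM question_tags WHERE question_id = ? AND tag_id = ?", (1, 2)
)

# Aggregate: how many questions carry each tag.
counts = list(conn.execute("""
    SELECT t.name, COUNT(*)
    FROM question_tags AS qt
    JOIN tags AS t ON t.id = qt.tag_id
    GROUP BY t.name
    ORDER BY t.name
"""))

# Search: all questions with a given tag, via an equality match.
sql_questions = [row[0] for row in conn.execute("""
    SELECT q.id
    FROM questions AS q
    JOIN question_tags AS qt ON qt.question_id = q.id
    JOIN tags AS t ON t.id = qt.tag_id
    WHERE t.name = ?
    ORDER BY q.id
""", ("sql",))]

print(counts)         # [('database-design', 1), ('python', 1), ('sql', 2)]
print(sql_questions)  # [1, 2]
```

The composite primary key stops the same tag from being attached to a question twice, and the extra index on tag_id is there so the lookup by tag does not have to scan the whole junction table.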