I have a checkbox form where users can select multiple values. They can then go back and select different values. Each value is stored as a row (UserID,value).
How do you do that INSERT when some rows might be duplicates of an already-existing row in the table?
Should I first delete the existing values and then INSERT the new values?
ON DUPLICATE KEY UPDATE seems tricky since I would be INSERTing multiple rows at once, so how would I define and separate just the ones that need UPDATING vs. the ones that need INSERTING?
For example, let’s say a user makes his first-time selection:
INSERT INTO Choices (UserID, value)
VALUES ('1','banana'),('1','apple'),('1','orange'),('1','cranberry'),('1','lemon');
What if the user goes back later and makes different choices that include some of the values from his original selection, which would then cause duplicates?
How should I handle that best?
In my opinion, simply deleting the existing choices and then inserting the new ones is the best way to go. It may not be the most efficient overall, but it is simple to code and thus has a much better chance of being correct.
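A minimal sketch of the delete-then-insert approach, assuming MySQL with a transactional engine such as InnoDB and the Choices(UserID, value) table from the question. The replacement values are made up for illustration ('kiwi' is not in the original selection).

```sql
-- Assumption: Choices uses InnoDB, so both statements commit or roll back together.
START TRANSACTION;

-- Remove every choice the user made previously.
DELETE FROM Choices
WHERE UserID = '1';

-- Insert the complete new selection (hypothetical values).
INSERT INTO Choices (UserID, value)
VALUES ('1','banana'),('1','kiwi'),('1','lemon');

COMMIT;
```

Wrapping the two statements in a transaction means other sessions never see the user with no choices at all, and a failed INSERT does not leave the old rows deleted.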
Otherwise it is necessary to find the intersection of the new choices and the old choices. Then either delete the obsolete ones or change them to the new choices (and then insert or delete, depending on whether the new set of choices is bigger or smaller than the original set). The added risk of the extra complexity does not seem worth it.
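For comparison, here is a rough sketch of the more involved route, in which rows that are still selected are left alone. It assumes MySQL and a UNIQUE (or primary) key on (UserID, value), which the question does not confirm. Without that key, INSERT IGNORE has nothing to detect duplicates against. The values are again hypothetical.

```sql
-- Assumption: a unique key exists, e.g.
--   ALTER TABLE Choices ADD UNIQUE KEY uq_user_value (UserID, value);
START TRANSACTION;

-- Delete only the choices that are no longer selected.
DELETE FROM Choices
WHERE UserID = '1'
  AND value NOT IN ('banana','kiwi','lemon');

-- Insert the new selection; rows that already exist are skipped.
INSERT IGNORE INTO Choices (UserID, value)
VALUES ('1','banana'),('1','kiwi'),('1','lemon');

COMMIT;
```

Note that INSERT IGNORE also downgrades some other errors to warnings. Appending ON DUPLICATE KEY UPDATE value = value to a plain INSERT is a common alternative that treats duplicates as a no-op. Either way, the list of values has to be built twice, which is part of the extra complexity.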
Edit: As @Andrew points out in the comments, deleting the originals en masse may not be a good plan if these records happen to be “parent” records in a referential integrity definition. My thinking was that this seemed an unlikely situation based on the OP’s description, but it is definitely worth considering.