I have rough ideas – like dealing with singular/plural, two or more words/phrases that mean the same thing, misspellings, etc. But I’m not sure of any patterns or rules of thumb for dealing with these, either programatically and automatically or by presenting them to administrators or even users to clean up.
Any thoughts or suggestions?
You should have a policy for the format of the tags (e.g. tags should be singular). Depending on how diverse the tags are, it might be useful not only to auto-complete while you are typing in a tag, but also to suggest similar tags, so that it is easy for people to use the tag system. Additionally, a cleanup process could correct common spelling mistakes and substitue deprecated tags according to a translation table.