I’m building a tagging system and I need to retrieve similar tags, so that when a user punches in “some thing” or “somé thing” or “söme thing” or “some¤thing” etc., they get all the matching rows in the table.
If I were using utf8_general_ci or utf8_unicode_ci on the field, it would be a piece o’ cake. I could just
SELECT * FROM tags WHERE tag LIKE 'some thing'
but alas, I need to use utf8_bin in that table. So, what do I do? I’m not much of a MySQL expert. I think I should be using CAST() or CONVERT(), but I’m not sure how.
The second part, matching some-thing, some*thing, some&thing etc., is another issue, but I think I can solve it on my own with regular expressions.
EDIT: THE SOLUTION
I figured that messing around with all this converting and regexping might not be the best way. Instead, I will use my framework’s methods to generate a URL “name” for the given tag and store it in the same db row.
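For anyone wanting to do the same, here is a rough sketch of what the table side of this could look like. The column name `url_name`, its length and the index name are placeholders I made up for illustration, and the normalized value itself ('some-thing' here) is assumed to come from whatever slug method your framework provides:

```sql
-- Hypothetical column holding the normalized URL "name" of each tag.
-- The application fills it in when a tag is inserted or updated.
ALTER TABLE tags
  ADD COLUMN url_name VARCHAR(255) NOT NULL DEFAULT '',
  ADD INDEX idx_tags_url_name (url_name);

-- Normalize the user's input in the application the same way,
-- then look up the tags by exact match on the normalized value.
SELECT *
FROM tags
WHERE url_name = 'some-thing';
```

The lookup is then a plain indexed equality match, and the tag column itself can stay utf8_bin.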
Yes, CONVERT() can do that.
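A minimal sketch of what such a query could look like, assuming the `tag` column uses the utf8 character set and is merely declared with the utf8_bin collation (adjust the names if your setup differs):

```sql
-- Override the column's binary collation for this one comparison.
SELECT *
FROM tags
WHERE tag COLLATE utf8_general_ci LIKE 'some thing';

-- The same idea with an explicit CONVERT(), in case the column
-- is stored in a character set other than utf8.
SELECT *
FROM tags
WHERE CONVERT(tag USING utf8) COLLATE utf8_general_ci LIKE 'some thing';
```

Keep in mind that changing the collation inside the WHERE clause will most likely keep MySQL from using an index on `tag`, so this can get slow on a big table.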
But I don’t think there is any benefit to using utf8_bin.
When handling tag searches, you could consider storing