I want to develop a small web app that would ideally be used worldwide. For the sake of discussion, let’s say it’s a recipe-sharing site, which is a good enough metaphor.
My app will allow users to enter or upload text in their native languages. My HTML header declares that the site uses UTF-8 encoding. I am now creating my MySQL database, and I suppose that I should select utf8 as the character set and utf8_unicode_ci as the collation.
Is that correct?
Is that all I need to do to be able to receive, store, and display safe user-generated content in the users' chosen languages? If not, what am I missing?
(I am aware of the safety concerns associated with displaying UGC, but that is not what this question is about – here I am solely looking for advice on handling safe content.)
That is all you need for the HTML and database parts, but you must also make sure that your programming language is UTF-8 aware so that it doesn’t garble your text. If you use PHP, just make sure that the string functions you use are UTF-8 aware. If a function isn’t, the manual usually mentions it.