Under what circumstances would you recommend using UTF-8? Is there an alternative to it that will serve the same purpose?
UTF-8 is being used for i18n?
Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.
Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.
Lost your password? Please enter your email address. You will receive a link and will create a new password via email.
Please briefly explain why you feel this question should be reported.
Please briefly explain why you feel this answer should be reported.
Please briefly explain why you feel this user should be reported.
Since you tagged this with web design, I assume you need to optimize the code size to be as small as possible to transfer files quickly.
The alternatives to UTF-8 would be the other Unicode encodings, since there is no alternative to using Unicode (for regular computer systems at least).
If you look at how UTF-8 is specified, you’ll see that all code points up to U+007F will require one octet, and code points up to U+07FF will require two octets, up to U+FFFF three and four octets for code points up to U+10FFFF.
For UTF-16, you will need two octets up to U+FFFF (mostly), and four octets for values up to U+10FFFF.
For UTF-32, you need four octets for all unicode points.
In other words, scripts that lie under U+07FF will have some size benefit from using UTF-8 compared to UTF-16, while scripts above that will have some size penalty.
However, since the domain is web design, it might be worth noting that all control characters lie in the one-octet range of UTF-8, which makes this less true for texts with lots of, say, HTML markup and Javascript, compared to the amount of actual “text”.
Scripts under U+07FF include Latin (except some extensions such as tone marks), Greek, Cyrillic, Hebrew and probably some more. Wikipedia has pretty good coverage on Unicode issues, and on the Unicode Consortium you can get even more details.