What character encoding should I use for a web page containing mostly Arabic text?
Is utf-8 okay?
UTF-8 can store the full Unicode range, so it’s fine to use for Arabic.
However, if you were wondering what encoding would be most efficient:
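Nearly all Arabic characters (everything in the Basic Multilingual Plane) can be encoded using a single UTF-16 code unit (2 bytes). In UTF-8 they take 2 code units (1 byte each) if they are in the main Arabic block, U+0600 to U+06FF, or 3 code units if they are in the extended or presentation-form blocks. So if you were encoding nothing but Arabic letters, UTF-16 would never be larger than UTF-8, and it would be more space efficient whenever the 3-byte characters appear.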
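To check the code-unit counts for yourself, here is a minimal Python sketch. The three sample characters and the helper name `encoded_sizes` are my own choices for illustration. It encodes with `utf-16-le` so that the 2-byte byte order mark is not counted.

```python
# Illustrative sample characters (chosen for this sketch, not from the answer above).
SAMPLES = {
    "Arabic letter beh (U+0628)": "\u0628",
    "Arabic ligature lam-alef, isolated form (U+FEFB)": "\uFEFB",
    "Latin small letter a (U+0061)": "a",
}

def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for a string."""
    # "utf-16-le" avoids counting the 2-byte byte order mark that "utf-16" prepends.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

for label, ch in SAMPLES.items():
    utf8_bytes, utf16_bytes = encoded_sizes(ch)
    print(f"{label}: UTF-8 = {utf8_bytes} bytes, UTF-16 = {utf16_bytes} bytes")
```

The letter from the main Arabic block comes out at 2 bytes in both encodings, the presentation form at 3 bytes in UTF-8 and 2 in UTF-16, and the ASCII letter at 1 byte in UTF-8 and 2 in UTF-16.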
However, you’re not just encoding Arabic. You’re also encoding a significant number of characters that can be stored in a single byte in UTF-8 but take two bytes in UTF-16: all the HTML markup characters (<, &, >, =) and all the HTML element names.
It’s a trade-off and, unless you’re dealing with huge documents, it doesn’t matter.
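To see the trade-off on a small piece of markup, you could compare the encoded size of the same fragment in both encodings. This is only a sketch: the `size_report` helper and the sample paragraph are made up for illustration, and a real page will have a different ratio of markup to text.

```python
def size_report(text):
    """Return the encoded size of text in bytes for UTF-8 and UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" leaves out the byte order mark so only the content is counted.
        "utf-16": len(text.encode("utf-16-le")),
    }

# Hypothetical sample content: a short Arabic phrase, alone and wrapped in markup.
arabic_text = "مرحبا بالعالم"
html_fragment = '<p dir="rtl" lang="ar">' + arabic_text + "</p>"

print("Arabic text only:", size_report(arabic_text))
print("With HTML markup:", size_report(html_fragment))
```

Every ASCII character in the tags, attributes and spaces costs 1 byte in UTF-8 and 2 in UTF-16, so the more markup a page carries, the more the comparison tilts toward UTF-8.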