What is collation used for in a database? I know a bit about UTF-8 for HTML, such as displaying text in other languages. But what about for a database? I'm using latin-1 (the default), and my friends told me to use UTF instead. When I asked why, they didn't know and said that others use it. So what does collation really do? Does it affect speed or something like that?
MySQL confuses the issue by having collations named after character encodings. They’re separate concepts.
A collation determines how the relational operators (<, >, etc.) and ORDER BY clauses sort strings. Issues considered by collations are:
Some of these depend on the language.
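To make the idea of a collation concrete, here is a minimal Python sketch, not MySQL itself. It sorts the same strings under two rules: raw code-point order (roughly what a binary collation does) and a case-insensitive order. The sample words and the use of `str.casefold` as a stand-in for a case-insensitive collation are assumptions chosen for illustration; real database collations apply more rules than this.

```python
# Sample data (hypothetical), mixing upper- and lowercase.
words = ["banana", "Apple", "cherry", "apple", "Banana"]

# Code-point order: every uppercase ASCII letter sorts before
# every lowercase one, so "Banana" comes before "apple".
binary_order = sorted(words)

# Case-insensitive order: compare the case-folded form of each string.
# Python's sort is stable, so ties keep their original relative order.
ci_order = sorted(words, key=str.casefold)


def ci_equal(a: str, b: str) -> bool:
    """Equality test under the same case-insensitive rule."""
    return a.casefold() == b.casefold()


print(binary_order)  # ['Apple', 'Banana', 'apple', 'banana', 'cherry']
print(ci_order)      # ['Apple', 'apple', 'banana', 'Banana', 'cherry']
print(ci_equal("Apple", "apple"))  # True
```

The bytes stored are identical in both cases; only the comparison rule changes, which is why collation is a separate concept from encoding.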
A character encoding determines how text values get converted to and from byte sequences. For a good introduction, see The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).
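As a small illustration of "converted to and from byte sequences", the following Python sketch encodes one string under two encodings and shows what happens when bytes are decoded with the wrong one. The sample word is an assumption picked because it contains a single non-ASCII character.

```python
text = "café"

# The same text becomes different bytes under different encodings.
latin1_bytes = text.encode("latin-1")  # é is the single byte 0xE9
utf8_bytes = text.encode("utf-8")      # é is the two bytes 0xC3 0xA9

print(latin1_bytes)  # b'caf\xe9'
print(utf8_bytes)    # b'caf\xc3\xa9'

# Decoding with the matching encoding recovers the original text.
assert latin1_bytes.decode("latin-1") == text
assert utf8_bytes.decode("utf-8") == text

# Decoding with the wrong encoding does not fail here,
# it silently produces garbage.
mojibake = utf8_bytes.decode("latin-1")
print(mojibake)  # cafÃ©
```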
There are hundreds of different character encodings, most of them specific to a certain combination of operating system and locale. Most of them are supersets of US-ASCII, so if you're damn sure your data will be ASCII-only, it doesn't matter what encoding you use.
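The ASCII-superset point can be checked with a short Python sketch. The three encodings listed are just a sample chosen for illustration, and `distinct_encodings` is a hypothetical helper name, not part of any library.

```python
# A few common encodings that are supersets of US-ASCII (illustrative sample).
SUPERSETS = ["latin-1", "cp1252", "utf-8"]


def distinct_encodings(text: str) -> set:
    """Return the set of distinct byte strings produced by encoding text."""
    return {text.encode(name) for name in SUPERSETS}


ascii_text = "SELECT name FROM users"
accented_text = "naïve"

# ASCII-only text yields identical bytes under every encoding in the sample.
print(len(distinct_encodings(ascii_text)))     # 1

# One non-ASCII character is enough for the encodings to disagree.
print(len(distinct_encodings(accented_text)))  # 2
```

Here latin-1 and cp1252 happen to agree on ï, while UTF-8 uses two bytes for it, so the second count is 2 rather than 3.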
But if you need other characters, you need an encoding that can handle them. For Western languages, your choices are generally:
The difference between the two is: