User inserts a string in a html form input on browser. This string is saved in database. How this string is encoded and decoded at each stage based on character encoding?
Flow as per technology stack used: browser –> ajax post –> spring mvc –>hibernate –>mysql db
You can expect that the browser post is an URL encoded UTF-8. Within the Java JVM, the string uses UTF-16, therefore roughly doubling the size taken if it is English text. Hibernate is part of that and it does not really care about the encoding, although it does pass around with connection strings as described next (
hibernate.connection.urlproperty).The UTF-16 string is then translated by the JDBC driver which, in case of MySQL, will use the
characterEncodingproperty inside the connection string. It helps if this matches the encoding of the database declared inCREATE DATABASEstatement, avoiding another re-encoding.Finally, “latin” is not a name of a specific character set or encoding. You probably mean ISO 8859-1, also known as Latin-1. This is not a good choice for a web server as it will not be able to represent most non-English strings. You should use UTF-8 in the database and in the connection string, ending up with UTF-8 -> UTF-16 -> UTF-8 which is a safe and reasonably efficient sequence (not counting any encoding that might be happening in the browser itself).
If you decide to alter the database to use UTF-8, be careful about changing the encoding at table level, too. Each table may use its own encoding and it does not change automatically.