I have a Python 2 Pyramid web app that uses SQLAlchemy to talk to a MySQL table in which all string columns are UTF-8 encoded. When I pull the data to display it, I must call .decode("UTF-8") for it to show; otherwise I get the usual error that the ASCII codec cannot decode the bytes.
I have two questions:
- Is there any other way of working that avoids the need for .decode("UTF-8") each and every time?
- If I want to push something into the database, and I have a string such as s = u'str', do I need to do anything to it before it is inserted into a UTF-8 column?
Thank you very much.
For people who might find this message through a Google search:
If you encounter an error such as UnicodeDecodeError: 'ascii' codec can't decode byte in
do use .encode(..)
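To make that error concrete, here is a minimal sketch of where it comes from and how explicit decoding and encoding avoid it. The sample value is made up, and the snippet only assumes that the driver hands back UTF-8 encoded byte strings, as described in the question.

```python
# Bytes as a MySQL driver might return them for a UTF-8 column (made-up value).
raw = u"caf\xe9".encode("utf-8")

# Decoding with the ASCII codec fails on the non-ASCII byte; Python 2 attempts
# this same decode implicitly whenever a byte string is mixed with unicode.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decode explicitly once on the way in, encode explicitly on the way out.
text = raw.decode("utf-8")
round_trip = text.encode("utf-8")
```

Keeping unicode objects inside the application and converting only at the boundaries means the implicit ASCII conversion never gets a chance to run.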
If your SQLAlchemy columns are of the Unicode type instead of String, SQLAlchemy will do the character encoding/decoding (in your case to/from UTF-8) for you.
Note that the String column type has a convert_unicode parameter which can be set to True, but this should only be used in the very rare cases where the database backend doesn’t have native Unicode support.
As @MartijnPieters mentioned in his comment, you should be aware of the MySQL Unicode section in the SQLAlchemy documentation. Namely, you should explicitly set the character encoding of the connection to the database with the charset parameter of the connection URL; otherwise the behaviour quoted below applies.
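A minimal sketch of setting the connection character encoding explicitly, assuming the MySQL-Python (mysqldb) driver and the charset query parameter mentioned in the quoted documentation. The helper name mysql_url, the credentials, and the database name are hypothetical, and a password containing special characters would additionally need URL quoting.

```python
def mysql_url(user, password, host, database, charset="utf8"):
    """Build a SQLAlchemy-style MySQL URL that sets the client charset."""
    return "mysql+mysqldb://{0}:{1}@{2}/{3}?charset={4}".format(
        user, password, host, database, charset
    )


# Hypothetical credentials and database name, for illustration only.
url = mysql_url("app_user", "secret", "localhost", "mydb")

# The URL would then be handed to SQLAlchemy, for example:
#   from sqlalchemy import create_engine
#   engine = create_engine(url)
```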
(the following is mostly quoted from the SQLAlchemy documentation)
“[…] many MySQL server installations default to a latin1 encoding for client connections, which has the effect of all data being converted into latin1, even if you have utf8 or another character set configured on your tables and columns. The charset parameter as received by MySQL-Python also has the side-effect of enabling use_unicode=1.”
“Manually configuring use_unicode=0 will cause MySQL-python to return encoded strings:”