I have a SQL query that I execute like this with an SQLAlchemy engine:
result = engine.execute('SELECT utf_8_field FROM table')
The database is MySQL and the column type is TEXT with UTF-8 encoding. The returned utf_8_field has type “str”, even when I set the option convert_unicode=True when creating the engine. As a result, if the string contains a character like ‘é’ (which is not in 7-bit ASCII, but is in the extended ASCII set), I get a UnicodeDecodeError when I try to execute this:
utf_8_field.encode("utf-8")
The exact error is:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 1: ordinal not in range(128)
When looking into this, I found that str.encode does not support the extended ASCII character set! I find this really strange, but that’s another question.
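The error can be reproduced without a database. The sketch below assumes the field comes back as the raw bytes h\xe9llo (a made-up value, not taken from the real table). In Python 2, calling encode on a str first decodes it implicitly with the default ASCII codec, and it is that hidden decode step that fails. Spelling the step out explicitly shows the same message. The byte 0xe9 also suggests the data arrives Latin-1 encoded rather than as UTF-8, where the same character would be two bytes.

```python
# Hypothetical value standing in for the byte string returned by the driver.
raw = b"h\xe9llo"

# Python 2's str.encode("utf-8") implicitly performs this ASCII decode first;
# doing it explicitly triggers the same UnicodeDecodeError.
try:
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# 0xe9 is e-acute in Latin-1; in UTF-8 that character is the bytes 0xc3 0xa9.
as_latin1 = raw.decode("latin-1")
as_utf8_bytes = as_latin1.encode("utf-8")

print(message)
print(repr(as_utf8_bytes))
```

Decoding with the codec the bytes were actually written in, and only then encoding to UTF-8, avoids the implicit ASCII step.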
What I don’t understand is why SQLAlchemy is not giving me a unicode string. I was previously using DB-API and that was working fine. I also don’t have SQLAlchemy table objects for my tables yet, that’s why I’m using an execute command.
Any ideas?
If you want the data converted automatically, you should specify the charset when you create the engine, typically by adding charset=utf8 to the connection URL.
Setting use_unicode alone won’t tell SQLAlchemy which charset to use.
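A minimal sketch of what specifying the charset at engine creation might look like. The user name, password, host and database name are placeholders, and build_mysql_url and make_engine are hypothetical helpers written for this example, not part of SQLAlchemy. The charset is passed as a query parameter on the connection URL, which SQLAlchemy forwards to the MySQL driver.

```python
def build_mysql_url(user, password, host, database, charset="utf8"):
    """Return a MySQL connection URL carrying the charset option.

    All arguments are placeholders; credentials containing special
    characters would need URL quoting first.
    """
    return "mysql://{0}:{1}@{2}/{3}?charset={4}".format(
        user, password, host, database, charset
    )


def make_engine(url):
    """Create the engine. Requires SQLAlchemy and a MySQL driver."""
    # Imported here so the URL helper can be used without SQLAlchemy installed.
    from sqlalchemy import create_engine
    return create_engine(url)


url = build_mysql_url("user", "secret", "localhost", "mydb")
print(url)
```

Passing this URL to make_engine, then running the original SELECT, should return the field already decoded, assuming the driver honours the charset setting.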