I am doing some string manipulation and storing the results in a database. Then I came across this string (I believe it's German):
Sichere Administration von VoIP-Endgeräten
After I put it into the database, I realized that the non-English characters had become:
Sichere Administration von VoIP-Endger\u00e4ten
and when I fetched it from the database and passed this string to subprocess.Popen(), it gave this error:
TypeError: execv() arg 2 must contain only strings
My question is: How did this happen? Also does anybody have any useful references about how to learn encoding/decoding stuff? Thanks.
Yes, read the Python Unicode HOWTO; you are dealing with encoded and unicode text.
The first string is UTF-8 data being interpreted as Latin-1; the second string is a unicode string and cannot be passed to
Popen() without encoding first. You'll need to figure out what encoding your external process can handle and call
.encode() on your data before passing it to Popen().
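As a rough sketch of what that could look like: the snippet below assumes the external process expects UTF-8, which may not be true in your case. The helpers encode_args() and popen_encoded() are hypothetical names made up for illustration, not part of subprocess.

```python
import subprocess

# The text as a unicode string; \u00e4 is just the escaped spelling of "ä".
title = u"Sichere Administration von VoIP-Endger\u00e4ten"

# Encoding turns text into bytes. UTF-8 is an assumption here; use whatever
# encoding the external program actually expects.
encoded = title.encode("utf-8")

# Reading those UTF-8 bytes as Latin-1 turns "ä" into the two characters "Ã¤".
mojibake = encoded.decode("latin-1")


def encode_args(args, encoding="utf-8"):
    """Return a copy of args with every unicode item encoded to bytes.

    Hypothetical helper: items that are already byte strings pass through.
    """
    text_type = type(u"")
    return [a.encode(encoding) if isinstance(a, text_type) else a for a in args]


def popen_encoded(args, encoding="utf-8", **kwargs):
    """Hypothetical wrapper: encode the arguments, then hand them to Popen."""
    return subprocess.Popen(encode_args(args, encoding), **kwargs)


# Example argument list, encoded and ready for Popen.
argv = encode_args([u"echo", title])
```

On Python 2, passing argv (rather than the raw unicode values) to Popen() avoids the implicit ASCII conversion that fails on "ä". On Python 3, Popen() accepts text arguments directly, so the explicit encode step is usually unnecessary there.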