I read Python's PEP 100 today. The ‘Unicode Default Encoding’ section says: ‘The Unicode implementation has to make some assumption about the encoding of 8-bit strings passed to it for coercion and about the encoding to as default for conversion of Unicode to strings when no specific encoding is given.’
My question is: what does ‘8-bit strings’ mean? Does it mean ASCII?
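No, ASCII is a 7-bit encoding; it defines only 128 characters. An ‘8-bit string’ here is a plain byte string (the str type in Python 2): a sequence of 8-bit bytes that carries no record of which encoding produced it. Most text encodings (including UTF-8 and the ISO-8859 family) are built on 8-bit units.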
Generally speaking, anything beyond the basic ASCII character set needs more than 7 bits to encode. So when dealing with international data, you usually deal with encodings that can use multiple bytes per encoded character. Python 2 will automatically try to decode byte strings to Unicode when you combine Unicode and byte string values, and the default encoding it uses for that is ASCII. Any byte with the high bit set then fails to decode, which is a frequent source of UnicodeDecodeError exceptions.
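To make that concrete, here is a small sketch (not taken from the PEP). The helper names is_seven_bit and try_ascii_decode are made up for illustration, and the explicit decode("ascii") call only imitates what Python 2 does implicitly when it coerces a byte string to Unicode.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch; helper names are hypothetical, not part of any library.

text = u"caf\u00e9"  # 'café': three ASCII characters plus one non-ASCII one

# The same text encoded two ways. Both results are plain byte strings.
utf8_bytes = text.encode("utf-8")      # é becomes two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # é becomes one byte: 0xE9


def is_seven_bit(data):
    """Return True if every byte fits in 7 bits, i.e. the data is valid ASCII."""
    return all(b < 0x80 for b in bytearray(data))


def try_ascii_decode(data):
    """Decode bytes as ASCII, imitating Python 2's implicit coercion.

    Returns the decoded text, or None if a byte falls outside the ASCII range.
    """
    try:
        return data.decode("ascii")
    except UnicodeDecodeError:
        return None
```

Decoding b"cafe" succeeds because every byte is below 0x80, while both encoded forms of 'café' contain a high-bit byte and are rejected by the ASCII codec. Decoding with the encoding the bytes were actually written in, for example utf8_bytes.decode("utf-8"), avoids the error.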
You really want to read up on Unicode and text encodings before you proceed, though. I can recommend: