Following this Python example, I encode a string as Base64 with:
>>> import base64
>>> encoded = base64.b64encode(b'data to be encoded')
>>> encoded
b'ZGF0YSB0byBiZSBlbmNvZGVk'
But, if I leave out the leading b:
>>> encoded = base64.b64encode('data to be encoded')
I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python32\lib\base64.py", line 56, in b64encode
    raise TypeError("expected bytes, not %s" % s.__class__.__name__)
TypeError: expected bytes, not str
Why is this?
Base64 encoding takes 8-bit binary byte data and encodes it using only the characters A-Z, a-z, 0-9, +, /* so that it can be transmitted over channels that do not preserve all 8 bits of data, such as email.
Hence, it wants a string of 8-bit bytes. You create those in Python 3 with the b'' syntax.
If you remove the b, it becomes a string. A string is a sequence of Unicode characters. base64 has no idea what to do with Unicode data; it's not 8-bit. It's not really any bits, in fact. 🙂
In your second example:
>>> encoded = base64.b64encode('data to be encoded')
All the characters fit neatly into the ASCII character set, and base64 encoding is therefore actually a bit pointless. You can convert it to ASCII bytes instead, with
>>> encoded = 'data to be encoded'.encode('ascii')
Or simpler:
>>> encoded = b'data to be encoded'
Which would be the same thing in this case.
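If you do start from a str and still want Base64, the usual pattern is to pick a text encoding first and then Base64-encode the resulting bytes. Here is a minimal sketch of that; the choice of UTF-8 and the variable names are my assumptions, not something from the question:

```python
import base64

text = 'data to be encoded'        # str: a sequence of Unicode characters
raw = text.encode('utf-8')         # bytes: UTF-8 is an assumed choice of encoding
encoded = base64.b64encode(raw)    # b64encode takes bytes and returns bytes
print(encoded)                     # b'ZGF0YSB0byBiZSBlbmNvZGVk'

# The result only contains ASCII characters, so it is safe to turn into a str
encoded_text = encoded.decode('ascii')

# Round trip: Base64-decode back to bytes, then decode the bytes back to text
decoded = base64.b64decode(encoded_text).decode('utf-8')
print(decoded == text)             # True
```

For pure ASCII input like this, encoding with 'ascii' or 'utf-8' gives identical bytes. The difference only matters once the text contains non-ASCII characters.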
* Most base64 flavours may also include a = at the end as padding. In addition, some base64 variants may use characters other than + and /. See the Variants summary table at Wikipedia for an overview.
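To make the footnote concrete, here is a small sketch using the standard library's base64 module. The input bytes are arbitrary values I picked so that padding and the + and / characters actually show up:

```python
import base64

# Input lengths that are not a multiple of 3 bytes get '=' padding
one_byte = base64.b64encode(b'a')       # b'YQ=='
two_bytes = base64.b64encode(b'ab')     # b'YWI='
three_bytes = base64.b64encode(b'abc')  # b'YWJj' (no padding needed)

# These three bytes map to the 6-bit values 62, 63, 63, 62
raw = b'\xfb\xff\xfe'
standard = base64.b64encode(raw)          # b'+//+'
urlsafe = base64.urlsafe_b64encode(raw)   # b'-__-' ('-' and '_' replace '+' and '/')

print(one_byte, two_bytes, three_bytes)
print(standard, urlsafe)
```

The URL-safe variant exists because + and / have special meanings in URLs and file names.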