I am trying to do string formatting with a unicode variable. For example:
>>> x = u"Some text—with an emdash."
>>> x
u'Some text\u2014with an emdash.'
>>> print(x)
Some text—with an emdash.
>>> s = "{}".format(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 9: ordinal not in range(128)
>>> t = "%s" %x
>>> t
u'Some text\u2014with an emdash.'
>>> print(t)
Some text—with an emdash.
You can see that I have a unicode string and that it prints just fine. The trouble starts when I use Python’s new (and improved?) format() method. With the old style (%s) everything works, but with {} and format() it fails.
Any idea why this is happening? I am using Python 2.7.2.
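The new format() is not as forgiving when you mix ASCII and unicode strings. In Python 2, "{}" is a byte string, so format() tries to encode x as ASCII to produce a byte-string result, while the % operator quietly promotes the result to unicode … so try this: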
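A minimal sketch of what that looks like, assuming the fix intended here is to make the format string itself a unicode literal (the variable names just mirror the question):

```python
# -*- coding: utf-8 -*-
# Same value as in the question, written with an escape so the source
# file encoding does not matter.
x = u"Some text\u2014with an emdash."

# The u prefix makes the format string unicode, so the result stays
# unicode and no implicit ASCII encoding of x is attempted.
s = u"{}".format(x)

print(s)
```

With the u prefix, format() behaves like the %s version in the question: s comes back as a unicode string equal to x.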