I was given to understand that calling print obj would call obj.__str__(), which would in turn return a string to print to the console. Now I had a problem with Unicode where I could not print any non-ASCII characters. I got the typical “ascii out of range” error.
While experimenting the following worked:
print obj.__str__()
print obj.__repr__()
With both functions doing exactly the same (__str__() just returns self.__repr__()). What did not work:
print obj
The problem occurred only when using a character outside the ASCII range. The final solution was to do the following in __str__():
return self.__repr__().encode(sys.stdout.encoding)
Now it works for all parts. My question now is: Where is the difference? Why does it work now? I would understand it if nothing had worked before and the fix made everything work. But why does only the top part work and not the bottom?
OS is Windows 7 x64 with a default Windows command prompt. Also the encoding is reported to be cp850. This is more of a general question to understand python. My problem is already solved, but I am not 100% happy, mostly because now calling str(obj) will yield a string that is not encoded in the way I wanted it.
# -*- coding: utf-8 -*-
class Sample(object):
    def __init__(self):
        self.name = u"üé"
    def __repr__(self):
        return self.name
    def __str__(self):
        return self.name
obj = Sample()
print obj.__str__(), obj.__repr__(), obj
Remove the last obj and it works. Keep it and it crashes with
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
My guess is that print does something like the following for an object obj it’s meant to print:
1. Checks if obj is a unicode. If so, encodes it to sys.stdout.encoding and prints.
2. Checks if obj is a str. If so, prints it directly.
3. If obj is anything else, calls str(obj) and prints that.
Step 1 is why print obj.__str__() works in your case.
Now, what str(obj) does is:
1. Calls obj.__str__().
2. If the result is a str, returns it.
3. If the result is a unicode, encodes it to "ascii" and returns that.
Calling obj.__str__() directly skips steps 2-3, which is why you don’t get the encoding failure.
The problem isn’t caused by how print works, it’s caused by how str() works. str() ignores sys.stdout.encoding. Since it doesn’t know what you want to do with the resulting string, the default encoding it uses can be considered arbitrary; ascii is as good or bad a choice as any.
To prevent this bug, make sure you return a str from __str__(), as the documentation tells you to do. A pattern you could use for Python 2.x might be to do the encoding inside __str__() yourself, as your fix with .encode(sys.stdout.encoding) already does. (If you’re sure you don’t need the str() representation for anything but printing to the console.)
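To make the two step lists above concrete, here is a minimal sketch that models them in plain Python. It only mirrors the guess made in this answer and is not how CPython actually implements print or str(). The helper names py2_style_str and py2_style_print_bytes, the Fixed class and its console_encoding parameter are all hypothetical, and cp850 is passed explicitly because that is the console encoding reported in the question. The code avoids the print statement so that it should behave the same on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
# Model of the guessed print/str() steps; all helper names are hypothetical.

text_type = type(u"")    # unicode on Python 2, str on Python 3
bytes_type = type(b"")   # str on Python 2, bytes on Python 3


def py2_style_str(obj):
    """Model the guessed steps of str(obj) on Python 2."""
    result = obj.__str__()                  # step 1: call __str__()
    if isinstance(result, bytes_type):      # step 2: byte string, return it as is
        return result
    if isinstance(result, text_type):       # step 3: unicode, encode with ASCII
        return result.encode("ascii")
    # Assumption: anything else is rejected; the answer does not cover this case.
    raise TypeError("__str__ returned a non-string")


def py2_style_print_bytes(obj, console_encoding):
    """Model the guessed steps of print: return the bytes that would be written."""
    if isinstance(obj, text_type):          # step 1: unicode, use console encoding
        return obj.encode(console_encoding)
    if isinstance(obj, bytes_type):         # step 2: byte string, write directly
        return obj
    return py2_style_str(obj)               # step 3: anything else goes via str()


class Sample(object):
    """Same shape as the class in the question: __str__ returns unicode."""
    def __init__(self):
        self.name = u"\xfc\xe9"             # u"üé"

    def __str__(self):
        return self.name


class Fixed(Sample):
    """Hypothetical fix: __str__ returns an encoded byte string."""
    def __init__(self, console_encoding):
        Sample.__init__(self)
        self.console_encoding = console_encoding

    def __unicode__(self):
        return self.name

    def __str__(self):
        return self.name.encode(self.console_encoding)


# "print obj.__str__()": the argument is already unicode, so step 1 of print applies.
direct_call = py2_style_print_bytes(Sample().__str__(), "cp850")

# "print obj": falls through to str(obj), which tries the ASCII codec and fails.
try:
    py2_style_print_bytes(Sample(), "cp850")
    implicit_failed = False
except UnicodeEncodeError:
    implicit_failed = True

# With the fix, __str__ already returns bytes, so nothing is re-encoded.
fixed = py2_style_print_bytes(Fixed("cp850"), "cp850")
```

In real Python 2 code the Fixed pattern would more likely read sys.stdout.encoding instead of taking a parameter, which ties str(obj) to the console, as the caveat above points out.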