The Archive Base

Editorial Team
Asked: May 16, 2026 (2026-05-16T17:14:40+00:00)


I’m struggling with print and unicode conversion. Here is some code executed in the Python 2.5 interpreter on Windows.

>>> import sys
>>> print sys.stdout.encoding
cp850
>>> print u"é"
é
>>> print u"é".encode("cp850")
é
>>> print u"é".encode("utf8")
├®
>>> print u"é".__repr__()
u'\xe9'

>>> class A():
...    def __unicode__(self):
...       return u"é"
...
>>> print A()
<__main__.A instance at 0x0000000002AEEA88>

>>> class B():
...    def __repr__(self):
...       return u"é".encode("cp850")
...
>>> print B()
é

>>> class C():
...    def __repr__(self):
...       return u"é".encode("utf8")
...
>>> print C()
├®

>>> class D():
...    def __str__(self):
...       return u"é"
...
>>> print D()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)

>>> class E():
...    def __repr__(self):
...       return u"é"
...
>>> print E()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)

So, when a unicode string is printed, it is not its __repr__() method that is called and printed.
But when an object is printed, __str__() or __repr__() (if __str__ is not implemented) is called, not __unicode__(). Neither can return a unicode string.
But why? If __repr__() or __str__() returns a unicode string, why isn’t the behavior the same as when we print a unicode string? In other words: why is print D() different from print D().__str__()?

Am I missing something?

These samples also show that if you want to print an object represented with unicode strings, you have to encode it to a byte string (type str). But for nice printing (avoiding the “├®”), the right encoding depends on sys.stdout.encoding.
So, do I have to add u"é".encode(sys.stdout.encoding) to each of my __str__ or __repr__ methods? Or return repr(u"é")?
What if I use piping? Is it the same encoding as sys.stdout?

My main issue is to make a class “printable”, i.e. print A() prints something fully readable (not with the \x*** unicode characters).
Here is the bad behavior/code that needs to be modified:

class User(object):
    name = u"Luiz Inácio Lula da Silva"
    def __repr__(self):
        # Attempt 1: returns unicode, so print raises UnicodeEncodeError
        return "<User: %s>" % self.name
        # Attempt 2 (alternative to the line above): won't display gracefully
        # e.g. print repr(u'é') -> u'\xe9'
        return repr("<User: %s>" % self.name)
        # Attempt 3 (alternative): won't display gracefully on a cp850 console
        # e.g. print u"é".encode("utf8") -> print '\xc3\xa9' -> ├®
        return ("<User: %s>" % self.name).encode("utf8")

Thanks!


1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 5:14 pm (2026-05-16T17:14:41+00:00)

    Python doesn’t have many semantic type constraints on given functions and methods, but it has a few, and here’s one of them: __str__ (in Python 2.*) must return a byte string. As usual, if a unicode object is found where a byte string is required, the current default encoding (usually 'ascii') is applied in the attempt to make the required byte string from the unicode object in question.
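To make that implicit step visible, here is a minimal sketch that performs the same conversion explicitly. It assumes the default encoding is 'ascii' (the usual Python 2 setting); the variable names are only for illustration, and the snippet is written so it also runs on Python 3.

```python
# -*- coding: utf-8 -*-
text = u"\xe9"  # the character é

# Explicit version of the conversion Python 2 attempts implicitly when
# __str__ returns a unicode object: encode with the 'ascii' codec.
try:
    text.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc  # same error type as in the tracebacks for D() and E()

# Choosing an encoding explicitly succeeds and yields a byte string.
encoded = text.encode("utf-8")
```

The failure comes from the codec, not from print: nothing here touches sys.stdout at all.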

    For this operation, the encoding (if any) of any given file object is irrelevant, because what’s being returned from __str__ may be about to be printed, or may be going to be subject to completely different and unrelated treatment. Your purpose in calling __str__ does not matter to the call itself and its results; Python, in general, doesn’t take into account the “future context” of an operation (what you are going to do with the result after the operation is done) in determining the operation’s semantics.

    That’s because Python doesn’t always know your future intentions, and it tries to minimize the amount of surprise. In particular, print str(x) and s = str(x); print s (the same operations performed in one gulp vs. two) must have the same effects: in the second case, there will be an exception if str(x) cannot validly produce a byte string (that is, for example, if x.__str__() can’t), and therefore the exception should also occur in the first case.

    print itself (since 2.4, I believe), when presented with a unicode object, takes into consideration the .encoding attribute (if any) of the target stream (by default sys.stdout); other operations, as yet unconnected to any given target stream, don’t — and str(x) (i.e. x.__str__()) is just such an operation.
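As a rough illustration of what "taking the stream's encoding into account" means, the sketch below uses io.TextIOWrapper as a stand-in for a console stream with a given encoding. This is an analogy, not how the Python 2 print statement is implemented; bytes_written is a hypothetical helper name. It also reproduces the "├®" from the question by reading UTF-8 bytes as cp850.

```python
import io

def bytes_written(text, encoding):
    """Return the bytes a text stream with this encoding emits for text."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)
    stream.flush()
    return raw.getvalue()

# A cp850 stream encodes é to the single byte the console expects.
cp850_bytes = bytes_written(u"\xe9", "cp850")

# UTF-8 bytes interpreted by a cp850 console become two wrong characters.
mojibake = u"\xe9".encode("utf-8").decode("cp850")
```

Here mojibake is the two-character string "├®", which is what the question saw after encoding with utf8 by hand.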

    Hope this helped show the reason for the behavior that is annoying you…

    Edit: the OP now clarifies “My main issue is to make a class “printable”, i.e. print A() prints something fully readable (not with the \x*** unicode characters).”. Here’s the approach I think works best for that specific goal:

    import sys
    
    DEFAULT_ENCODING = 'UTF-8'  # or whatever you like best
    
    class sic(object):
    
        def __unicode__(self):  # the "real thing"
            return u'Pel\xe9'
    
        def __str__(self):      # tries to "look nice"
            return unicode(self).encode(sys.stdout.encoding or DEFAULT_ENCODING,
                                        'replace')
    
        def __repr__(self):     # must be unambiguous
            return repr(unicode(self))
    

    That is, this approach focuses on __unicode__ as the primary way for the class’s instances to format themselves — but since (in Python 2) print calls __str__ instead, it has that one delegate to __unicode__ with the best it can do in terms of encoding. Not perfect, but then Python 2’s print statement is far from perfect anyway;-).
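The encoding choice inside __str__ can be pulled out and checked on its own. The sketch below is an assumption-laden illustration, not part of the class above: FakeStream and encode_for_stream are hypothetical names, and FakeStream merely imitates the .encoding attribute of sys.stdout. The None case matters for the piping question, since in Python 2 sys.stdout.encoding is typically None when output is piped or redirected.

```python
class FakeStream(object):
    """Hypothetical stand-in for sys.stdout with a configurable encoding."""
    def __init__(self, encoding):
        self.encoding = encoding

def encode_for_stream(text, stream, default="UTF-8"):
    """Encode text for stream, falling back to default if it has no encoding."""
    encoding = getattr(stream, "encoding", None) or default
    # 'replace' substitutes '?' for unencodable characters instead of raising.
    return text.encode(encoding, "replace")

name = u"Pel\xe9"
console = encode_for_stream(name, FakeStream("cp850"))    # console-like stream
piped = encode_for_stream(name, FakeStream(None))         # no encoding known
ascii_only = encode_for_stream(name, FakeStream("ascii")) # limited stream
```

With 'replace', a stream that cannot represent é degrades to "Pel?" rather than raising UnicodeEncodeError, which is the trade-off the __str__ above accepts.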

    __repr__, for its part, must strive to be unambiguous, that is, not to “look nice” at the expense of risking ambiguity (ideally, when feasible, it should return a byte string that, if passed to eval, would make an instance equal to the present one… that’s far from always feasible, but the lack of ambiguity is the absolute core of the distinction between __str__ and __repr__, and I strongly recommend respecting that distinction!).
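A small sketch of that distinction, using a hypothetical Point class (not from the question) whose __repr__ can be passed to eval to rebuild an equal instance, while __str__ only aims to look nice:

```python
class Point(object):
    """Hypothetical example class with an eval-able __repr__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):   # unambiguous: names the type and uses repr of fields
        return "Point(%r, %r)" % (self.x, self.y)

    def __str__(self):    # readable, but does not say what kind of object it is
        return "(%s, %s)" % (self.x, self.y)

p = Point(1, 2)
clone = eval(repr(p))  # round trip through the repr
```

The str form "(1, 2)" could just as well be a tuple; the repr form leaves no doubt.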

