The Archive Base


Editorial Team
Asked: June 9, 2026 (2026-06-09T23:29:22+00:00)


Apparently, I can do that in Python 2.7:

value = '國華'

It seems Python is using an encoding to encode the characters in the string literal to a byte string. What is that encoding? Is that the encoding defined in sys.getdefaultencoding(), the encoding of the source file, or something else?

Thanks


1 Answer

  1. Editorial Team
    Added an answer on June 9, 2026 at 11:29 pm (2026-06-09T23:29:23+00:00)

    sys.getdefaultencoding() has no relation to the encoding of the source file or the terminal. It is the encoding used to convert byte strings implicitly to Unicode strings, and it should always be 'ascii' on Python 2.X ('utf-8' on Python 3.X).
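
As a quick check of the point above, here is a minimal sketch that prints the default encoding next to the terminal encoding. It assumes a Python 3 interpreter (not the Python 2.7 from the question), and the variable names are illustrative only.

```python
import sys

# Default codec for str.encode() / bytes.decode() on Python 3
# (on Python 2 it is the codec used for implicit conversions).
default_encoding = sys.getdefaultencoding()
print(default_encoding)  # 'utf-8' on Python 3; Python 2 reports 'ascii'

# The terminal encoding is a separate setting and may differ from the above.
terminal_encoding = getattr(sys.stdout, "encoding", None)
print(terminal_encoding)
```

Neither value says anything about how a source file was saved; that is governed only by the coding declaration.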

    On Python 2.X, your line of code in a script with no encoding declared produces an error:

    SyntaxError: Non-ASCII character '\x87' in file ...
    

    The byte reported depends on how the file was saved, but in every case Python 2.X requires an encoding declaration before it accepts non-ASCII characters in a source file, and that declaration must match the encoding the file is actually saved in. For example:

    # coding: utf8
    value = '國華'
    

    when saved as cp936 produces:

    SyntaxError: 'utf8' codec can't decode byte 0x87 in position 9: invalid start byte
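
The mismatch can be reproduced without saving any files. The sketch below assumes Python 3 and uses a hypothetical `source_line` string as a stand-in for the second line of the script; it encodes that line as cp936 and then tries to decode it as UTF-8, which is roughly what the parser does when the declaration says utf8.

```python
# Hypothetical stand-in for a source line saved by an editor set to cp936.
source_line = "value = '國華'"
raw_bytes = source_line.encode("cp936")

try:
    # Decode with the declared (wrong) encoding.
    raw_bytes.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)        # the codec error message
print(error.start)  # offset of the offending byte within the line
```

The failing offset is 9 because the nine ASCII bytes of `value = '` come first, which lines up with "position 9" in the error message above.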
    

    When the declaration is correct, a byte string literal contains exactly the bytes that appear in the source file, that is, the characters as encoded in the source encoding. When Python parses a Unicode string literal, it decodes those bytes to Unicode code points using the declared source encoding. Note the difference when printing a UTF-8 byte string and a Unicode string on a cp936 console:

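    # coding: utf8
    value = '國華'    # byte string: the raw UTF-8 bytes from the file
    print value, repr(value)
    value = u'國華'   # Unicode string: bytes decoded with the declared encoding
    print value, repr(value)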
    

    Output:

    鍦嬭彲 '\xe5\x9c\x8b\xe8\x8f\xaf'
    國華 u'\u570b\u83ef'
    

    The byte string contains the 3-byte UTF-8 encodings of the two characters, but it is displayed incorrectly because a cp936 terminal interprets those six bytes as three 2-byte cp936 characters. The Unicode string is printed correctly, and it contains the Unicode code points decoded from the UTF-8 bytes of the source file.
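
The garbled display can be simulated by decoding the UTF-8 bytes with the terminal's codec. This is only a sketch under the assumption of Python 3, where `str` is Unicode and `bytes` plays the role of the Python 2 byte string; the variable names are illustrative.

```python
text = "國華"
utf8_bytes = text.encode("utf-8")
print(repr(utf8_bytes))  # six bytes, three per character

# A cp936 terminal reads the same six bytes as three 2-byte characters.
mojibake = utf8_bytes.decode("cp936")
print(mojibake)
print(len(text), len(mojibake))
```

No bytes are lost in the process: encoding the garbled text back to cp936 returns the original UTF-8 bytes, so the data is intact and only the interpretation is wrong.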

    Note the difference when declaring and using the encoding that matches the terminal:

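    # coding: cp936
    value = '國華'    # byte string: the raw cp936 bytes from the file
    print value, repr(value)
    value = u'國華'   # Unicode string: bytes decoded with the declared encoding
    print value, repr(value)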
    

    Output:

    國華 '\x87\xf8\xc8A'
    國華 u'\u570b\u83ef'
    

    The byte string now contains the 2-byte cp936 encodings of the two characters (the 'A' in the repr is the byte '\x41') and is displayed correctly because the terminal understands cp936 byte sequences. The Unicode string contains the same code points as in the previous example, because the source bytes were decoded to Unicode using the declared source encoding.
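
To see that both byte strings describe the same text, the sketch below decodes each one with its own codec and compares the results. It assumes Python 3, and the byte values are copied from the two outputs shown above.

```python
utf8_bytes = b"\xe5\x9c\x8b\xe8\x8f\xaf"  # byte string from the utf8 example
cp936_bytes = b"\x87\xf8\xc8A"            # byte string from the cp936 example

from_utf8 = utf8_bytes.decode("utf-8")
from_cp936 = cp936_bytes.decode("cp936")

# Different bytes, same code points once decoded with the right codec.
print([hex(ord(ch)) for ch in from_utf8])
print(from_utf8 == from_cp936)
```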

    If a script has a correct source encoding declaration and uses Unicode strings for text, it will display the correct characters[1] regardless of terminal encoding[2]. It will throw a UnicodeEncodeError if the terminal doesn't support the character rather than display the wrong character.
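
The failure mode for unsupported characters can be sketched by encoding the text by hand, which approximates what printing a Unicode string to a terminal does. This assumes Python 3, and `can_display` is a hypothetical helper written for this illustration, not a standard function.

```python
text = "國華"

def can_display(value, terminal_encoding):
    """Hypothetical helper: True if the codec can represent every character."""
    try:
        value.encode(terminal_encoding)
    except UnicodeEncodeError:
        return False
    return True

print(can_display(text, "cp936"))  # a Chinese code page
print(can_display(text, "utf-8"))
print(can_display(text, "cp437"))  # the original IBM PC code page
```

A code page that lacks the characters raises an error instead of silently writing wrong bytes, which is why Unicode strings are the safer choice for text.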

    A final note: Python 2.X defaults to 'ascii' source encoding unless declared otherwise, and allows non-ASCII characters in byte strings if the declared encoding supports them. Python 3.X defaults to 'utf8' source encoding (so make sure to save the file in that encoding or declare otherwise), and does not allow non-ASCII characters in byte string literals.
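
The Python 3 restriction on byte string literals can be checked with the built-in `compile`. The sketch below assumes Python 3, and `compiles` is a hypothetical helper written for this illustration.

```python
def compiles(source):
    """Hypothetical helper: True if the source text compiles without error."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

print(compiles("value = '國華'"))   # text literal
print(compiles("value = b'國華'"))  # bytes literal with non-ASCII characters
print(compiles(r"value = b'\xe5\x9c\x8b\xe8\x8f\xaf'"))  # escaped bytes
```

On Python 3 the encoded bytes have to be written with escapes or produced with `str.encode`, so the choice of encoding is always explicit.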

    [1] If the terminal font supports the character.
    [2] If the terminal encoding supports the character.


