The Archive Base

Editorial Team
Asked: May 28, 2026


Consider the next example:

>>> s = u"баба"
>>> s
u'\xe1\xe0\xe1\xe0'
>>> print s
áàáà

I'm using the cp1251 encoding in IDLE, but it seems the interpreter actually uses latin1 to create the unicode string:

>>> print s.encode('latin1')
баба

Why is that? Is there a spec for this behavior?

CPython, 2.7.


Edit

The code I was actually looking for is

>>> u'\xe1\xe0\xe1\xe0' == u'\u00e1\u00e0\u00e1\u00e0'
True

It seems that when encoding unicode with the latin1 codec, all code points below 256 are simply left as is, which yields the same bytes I typed in originally.
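
The observation in the edit can be checked directly. The sketch below assumes Python 3 (where `str` is the unicode type and `bytes` is the byte-string type) rather than the 2.7 interpreter from the question, and the helper name `latin1_is_identity` is made up for illustration. It verifies that latin1 maps every code point below 256 to the byte with the same value.

```python
# Assumes Python 3: str is unicode text, bytes is raw bytes.

def latin1_is_identity():
    """Return True if latin1 maps each code point 0..255 to the same byte value."""
    return all(chr(i).encode('latin1') == bytes([i]) for i in range(256))

s = '\xe1\xe0\xe1\xe0'        # the string the interpreter built
typed = s.encode('latin1')     # encoding it gives back the bytes that were typed

print(latin1_is_identity())               # True
print(typed == b'\xe1\xe0\xe1\xe0')       # True
print(s == '\u00e1\u00e0\u00e1\u00e0')    # True
```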

1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 6:58 am

    When you type a character such as б into the terminal, you see a б, but what is actually input is a sequence of bytes.

    Since your terminal encoding is cp1251, typing баба results in the sequence of bytes equal to the unicode баба encoded in cp1251:

    In [219]: "баба".decode('utf-8').encode('cp1251')
    Out[219]: '\xe1\xe0\xe1\xe0'
    

    (Note I use utf-8 above because my terminal encoding is utf-8, not cp1251. For me, "баба".decode('utf-8') is just unicode for баба.)
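
To make the difference between the two terminal encodings concrete, here is a minimal sketch, assuming Python 3 rather than the 2.7 session above. The word is spelled with `\u` escapes so the result does not depend on how this snippet itself is stored or typed; the variable names are illustrative only.

```python
# Assumes Python 3; the escapes spell the Cyrillic word so the example
# does not depend on the encoding of the source file or the terminal.
word = '\u0431\u0430\u0431\u0430'

cp1251_bytes = word.encode('cp1251')   # what a cp1251 terminal would send
utf8_bytes = word.encode('utf-8')      # what a utf-8 terminal would send

print(cp1251_bytes)  # b'\xe1\xe0\xe1\xe0'
print(utf8_bytes)    # b'\xd0\xb1\xd0\xb0\xd0\xb1\xd0\xb0'
```

The same four letters become four bytes in cp1251 and eight bytes in utf-8, which is why the decode step has to name the encoding the terminal actually uses.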

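    Since typing баба produces the bytes \xe1\xe0\xe1\xe0, when you type u"баба" into the terminal Python does not see Cyrillic letters. In your session each byte was evidently turned into the code point with the same value (which is what a latin1 decode does), so Python ends up with u'\xe1\xe0\xe1\xe0'. This is why you are seeing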

    >>> s
    u'\xe1\xe0\xe1\xe0'
    

    This unicode string happens to represent áàáà.

    And when you type

    >>> print s.encode('latin1')
    

    the latin1 encoding converts u'\xe1\xe0\xe1\xe0' to '\xe1\xe0\xe1\xe0'.
    The terminal receives the sequence of bytes '\xe1\xe0\xe1\xe0', and decodes them with cp1251, thus printing баба:

    In [222]: print('\xe1\xe0\xe1\xe0'.decode('cp1251'))
    баба
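
The whole chain described above can be replayed without a cp1251 terminal. This is a sketch under the assumption of Python 3, with made-up variable names; it compares against escaped literals instead of printing the letters, so it behaves the same regardless of the console encoding.

```python
# Assumes Python 3. Variable names are illustrative only.
typed_bytes = b'\xe1\xe0\xe1\xe0'          # what a cp1251 terminal sends

misread = typed_bytes.decode('latin1')     # each byte taken as a code point
intended = typed_bytes.decode('cp1251')    # bytes decoded as cp1251

# Encoding the misread string with latin1 restores the original bytes,
# which a cp1251 terminal then renders as the intended word.
restored = misread.encode('latin1')

print(misread == '\u00e1\u00e0\u00e1\u00e0')    # True
print(intended == '\u0431\u0430\u0431\u0430')   # True
print(restored == typed_bytes)                  # True
```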
    

    Try:

    >>> s = "баба"
    

    (without the u) instead, so that s is a plain byte string holding the cp1251 bytes. Or,

    >>> s = "баба".decode('cp1251')
    

    to make s unicode. Or, use the verbose but very explicit (and terminal-encoding agnostic):

    >>> s = u'\N{CYRILLIC SMALL LETTER BE}\N{CYRILLIC SMALL LETTER A}\N{CYRILLIC SMALL LETTER BE}\N{CYRILLIC SMALL LETTER A}'
    

    Or the short but less readily comprehensible:

    >>> s = u'\u0431\u0430\u0431\u0430'
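
As a quick check that the terminal-independent spellings agree with the cp1251 decode, the following sketch (again assuming Python 3, where no `u` prefix is needed) compares them and looks up the character names with the standard `unicodedata` module.

```python
import unicodedata

# Assumes Python 3, where plain string literals are unicode.
by_name = ('\N{CYRILLIC SMALL LETTER BE}\N{CYRILLIC SMALL LETTER A}'
           '\N{CYRILLIC SMALL LETTER BE}\N{CYRILLIC SMALL LETTER A}')
by_code = '\u0431\u0430\u0431\u0430'
from_bytes = b'\xe1\xe0\xe1\xe0'.decode('cp1251')

# All three routes produce the same four code points.
print(by_name == by_code == from_bytes)   # True

# The official names make the escaped form easier to audit.
names = [unicodedata.name(ch) for ch in by_code]
print(names[0])   # CYRILLIC SMALL LETTER BE
print(names[1])   # CYRILLIC SMALL LETTER A
```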
    
© 2021 The Archive Base. All Rights Reserved