Editorial Team
Asked: May 20, 2026

I have a unicode string like ‘%C3%A7%C3%B6asd+fjkls%25asd’ and I want to decode this string.

I have a unicode string like '%C3%A7%C3%B6asd+fjkls%25asd' and I want to decode this string.
I used urllib.unquote_plus(str), but it gives the wrong result.

  • expected : çöasd+fjkls%asd
  • result : çöasd fjkls%asd

The two-byte UTF-8 sequences (%C3%A7 and %C3%B6) are decoded incorrectly.
I am using Python 2.7 on a Linux distro.
What is the best way to get the expected result?


1 Answer

  1. Editorial Team
    Added an answer on May 20, 2026 at 10:24 am

    You have three, four, or five problems … but repr() and unicodedata.name() are your friends: they show you unambiguously what you actually have, without the confusion that arises when people with different console encodings compare the output of print fubar.

    Summary: either (a) you start with a unicode object and apply the unquote function to that or (b) you start off with a str object and your console encoding is not UTF-8.
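
    Both cases in the summary produce the same visible symptom, because in each one the UTF-8 bytes C3 A7 C3 B6 end up being read as one character per byte. A minimal sketch of that effect, assuming latin1 as the wrong interpretation (the variable names are illustrative and do not come from the question; the code is written to run on Python 2.7 and 3):

```python
import unicodedata

# The four bytes that '%C3%A7%C3%B6' stands for once unquoted.
utf8_bytes = b'\xc3\xa7\xc3\xb6'

# Correct interpretation: two characters.
correct = utf8_bytes.decode('utf-8')

# Wrong interpretation (assumed here to be latin1): four characters,
# one per byte, which is what the garbled output consists of.
mojibake = utf8_bytes.decode('latin-1')

print(len(correct), len(mojibake))                # 2 and 4
print(mojibake == u'\xc3\xa7\xc3\xb6')            # True
print(unicodedata.name(mojibake[0]))              # LATIN CAPITAL LETTER A WITH TILDE
```

    Comparing with == and unicodedata.name() rather than printing the characters keeps the check independent of the console encoding.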

    If, as you say, you start off with a unicode object:

    >>> s0 = u'%C3%A7%C3%B6asd+fjkls%25asd'
    >>> print repr(s0)
    u'%C3%A7%C3%B6asd+fjkls%25asd'
    

    This is accidental nonsense: a unicode object holding percent-escaped UTF-8 bytes. If you apply urllibX.unquote_YYYY() to it, you get another nonsense unicode object (u'\xc3\xa7\xc3\xb6asd+fjkls%asd'), which would cause the symptoms you showed when printed. You should convert your original unicode object to a str object immediately:

    >>> s1 = s0.encode('ascii')
    >>> print repr(s1)
    '%C3%A7%C3%B6asd+fjkls%25asd'
    

    Then you should unquote it:

    >>> import urllib
    >>> s2 = urllib.unquote(s1)  # unquote, not unquote_plus, so '+' stays a plus sign
    >>> print repr(s2)
    '\xc3\xa7\xc3\xb6asd+fjkls%asd'
    

    Looking at the first 4 bytes of that, it’s encoded in UTF-8. If you do print s2, it will look OK if your console is expecting UTF-8, but if it’s expecting ISO-8859-1 (aka latin1) you’ll see your symptomatic rubbish (first char will be A-tilde). Let’s park that thought for a moment and convert it to a Unicode object:

    >>> s3 = s2.decode('utf8')
    >>> print repr(s3)
    u'\xe7\xf6asd+fjkls%asd'
    

    and inspect it to see what we’ve actually got:

    >>> import unicodedata
    >>> for c in s3[:6]:
    ...     print repr(c), unicodedata.name(c)
    ...
    u'\xe7' LATIN SMALL LETTER C WITH CEDILLA
    u'\xf6' LATIN SMALL LETTER O WITH DIAERESIS
    u'a' LATIN SMALL LETTER A
    u's' LATIN SMALL LETTER S
    u'd' LATIN SMALL LETTER D
    u'+' PLUS SIGN
    

    Looks like what you said you expected. Now we come to the question of displaying it on your console. Note: don’t freak out when you see “cp850”; I’m doing this portably and just happen to be doing this in a Command Prompt on Windows.

    >>> import sys
    >>> sys.stdout.encoding
    'cp850'
    >>> print s3
    çöasd+fjkls%asd
    

    Note: print implicitly encoded the unicode object using sys.stdout.encoding. Fortunately, all the unicode characters in s3 are representable in that encoding (and in cp1252 and latin1).
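
    The steps above (encode to ASCII bytes, unquote, decode as UTF-8) can be collected into one function. This is only a sketch: decode_percent_encoded is a hypothetical name, and the import fallback is an assumption added so that the same code runs on Python 2.7 and on Python 3, where the byte-level function is urllib.parse.unquote_to_bytes.

```python
try:
    # Python 2: given a str (bytes), urllib.unquote returns a str (bytes).
    from urllib import unquote as _unquote_bytes
except ImportError:
    # Python 3: the byte-level equivalent.
    from urllib.parse import unquote_to_bytes as _unquote_bytes


def decode_percent_encoded(text):
    """Return the text that a percent-encoded UTF-8 string stands for.

    Hypothetical helper. '+' is left alone, as in the expected
    result in the question (unquote, not unquote_plus).
    """
    # Step 1: a text object holding only ASCII escapes becomes bytes.
    # Non-ASCII input raises UnicodeEncodeError here, on purpose.
    raw = text if isinstance(text, bytes) else text.encode('ascii')
    # Step 2: turn each %XX escape into the single byte it names.
    unquoted = _unquote_bytes(raw)
    # Step 3: interpret those bytes as UTF-8.
    return unquoted.decode('utf-8')


result = decode_percent_encoded(u'%C3%A7%C3%B6asd+fjkls%25asd')
print(result == u'\xe7\xf6asd+fjkls%asd')  # True
```

    Printing result itself still depends on sys.stdout.encoding, as discussed above; the function only guarantees the right unicode object.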
