The Archive Base

Editorial Team
Asked: May 17, 2026

I want my function to take an argument that could be an unicode object


I want my function to take an argument that could be a unicode object or a UTF-8 encoded string. Inside my function, I want to convert the argument to unicode. I have something like this:

def myfunction(text):
    if not isinstance(text, unicode):
        text = unicode(text, 'utf-8')

    ...

Is it possible to avoid the use of isinstance? I was looking for something more duck-typing friendly.

During my experiments with decoding, I have run into several weird behaviours of Python. For instance:

>>> u'hello'.decode('utf-8')
u'hello'
>>> u'cer\xf3n'.decode('utf-8')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 3: ordinal not in range(128)

Or

>>> u'hello'.decode('utf-8')
u'hello'
>>> unicode(u'hello', 'utf-8')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: decoding Unicode is not supported

By the way, I’m using Python 2.6.

1 Answer

  1. Editorial Team
    Added an answer on May 17, 2026 at 6:32 am

    You could just try decoding it with the ‘utf-8’ codec, and if that does not work, then return the object.

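    def myfunction(text):
        try:
            # Byte strings are decoded; unicode objects raise TypeError here.
            text = unicode(text, 'utf-8')
        except TypeError:
            # Already a unicode object, so leave it unchanged.
            pass
        # Return on both paths; without this, decoded input would yield None.
        return text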
    
    print(myfunction(u'cer\xf3n'))
    # cerón
    

    When you call decode on a unicode object, Python 2 first implicitly encodes it to a byte string (str) using the default codec, and only then calls that byte string’s decode('utf-8') method.

    That implicit encoding fails whenever the text contains non-ASCII characters, because Python 2 uses the ascii codec by default. This is why decoding u'cer\xf3n' raises a UnicodeEncodeError while u'hello' decodes without complaint.
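The hidden encoding step can be reproduced by hand. The snippet below is a small sketch rather than part of the original answer: it encodes the sample strings from the question with the ascii codec explicitly, which is roughly what Python 2 attempts before decoding. The variable names are illustrative.

```python
# Sample text from the question; the u prefix is accepted by Python 2 and Python 3.3+.
text = u'cer\xf3n'

# Pure-ASCII text survives the ascii codec, so the hidden encode step succeeds.
ascii_ok = u'hello'.encode('ascii')

# Non-ASCII text does not: this raises the same error class seen in the traceback.
try:
    text.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(repr(ascii_ok))
print('%s at position %d' % (type(error).__name__, error.start))
```

The reported position matches the "position 3" in the question's traceback, since the accented character is the fourth character of the string.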

    So, in general, never try to decode unicode objects. If you must, wrap the call in a try..except block. A few codecs do accept unicode objects for decoding in Python 2 (see below), but they were removed from the encode/decode API in Python 3.
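One way to follow that advice is to decode only values that are actually byte strings. The helper below is a minimal sketch under the assumption that the code may also need to run on Python 3, which goes beyond the Python 2.6 setup in the question. The name to_text is hypothetical, and the sketch does use isinstance, although against bytes rather than unicode. In Python 2.6 and later, bytes is an alias for str, so the same check covers both versions.

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding it only if it is a byte string.

    to_text is a hypothetical helper name, not part of any library.
    """
    if isinstance(value, bytes):
        # Byte strings (str in Python 2, bytes in Python 3) are decoded.
        return value.decode(encoding)
    # Text objects are returned untouched, so nothing is decoded twice.
    return value


decoded = to_text(b'cer\xc3\xb3n')   # UTF-8 encoded bytes
unchanged = to_text(u'cer\xf3n')     # already text
print(decoded == unchanged)
```

Because text objects never reach decode, the implicit ascii encoding described above is never triggered.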

    See this Python bug ticket for an interesting discussion of the issue,
    and also Guido van Rossum’s blog:

    “We are adopting a slightly different approach to codecs: while in Python 2, codecs can accept either Unicode or 8-bits as input and produce either as output, in Py3k, encoding is always a translation from a Unicode (text) string to an array of bytes, and decoding always goes the opposite direction. This means that we had to drop a few codecs that don’t fit in this model, for example rot13, base64 and bz2 (those conversions are still supported, just not through the encode/decode API).”
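As an illustration of "still supported, just not through the encode/decode API", the sketch below shows how those three conversions can be reached in Python 3 through the standard library's codecs, base64 and bz2 modules. It assumes Python 3 and is not taken from the quoted post; the variable names are illustrative.

```python
import base64
import bz2
import codecs

# rot13 is a text-to-text transform, reached through codecs.encode
# rather than through str.encode.
rot13_text = codecs.encode('hello', 'rot_13')

# base64 and bz2 are bytes-to-bytes transforms with their own modules.
b64_bytes = base64.b64encode(b'hello')
roundtrip = bz2.decompress(bz2.compress(b'hello'))

print(rot13_text)
print(b64_bytes)
print(roundtrip)
```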
