The Archive Base

Editorial Team
Asked: June 4, 2026 (2026-06-04T15:35:09+00:00)

I spent the better part of an afternoon trying to patch dictionary objects so that they hold UTF-8 encoded byte strings instead of unicode objects. I am looking for the fastest, best-performing way to extend a dictionary object and ensure that its entries, both keys and values, are UTF-8 encoded.

Here is what I have come up with. It does the job, but I'm wondering what improvements could be made.

# Python 2 code: `unicode` and `dict.iteritems` do not exist in Python 3.
class UTF8Dict(dict):
    def __init__(self, *args, **kwargs):
        # Encode everything passed to the constructor, including nested
        # lists and dicts, before storing it.
        d = dict(*args, **kwargs)
        d = _decode_dict(d)
        super(UTF8Dict, self).__init__(d)

    def __setitem__(self, key, value):
        # Only top-level unicode keys and values are encoded here;
        # nested containers are stored as-is.
        if isinstance(key, unicode):
            key = key.encode('utf-8')
        if isinstance(value, unicode):
            value = value.encode('utf-8')
        return super(UTF8Dict, self).__setitem__(key, value)

# Despite the name, this encodes unicode objects to UTF-8 byte strings.
def _decode_list(data):
    rv = []
    for item in data:
        if isinstance(item, unicode):
            item = item.encode('utf-8')
        elif isinstance(item, list):
            item = _decode_list(item)
        elif isinstance(item, dict):
            item = _decode_dict(item)
        rv.append(item)
    return rv

# Same as _decode_list, but for the keys and values of a dict.
def _decode_dict(data):
    rv = {}
    for key, value in data.iteritems():
        if isinstance(key, unicode):
            key = key.encode('utf-8')
        if isinstance(value, unicode):
            value = value.encode('utf-8')
        elif isinstance(value, list):
            value = _decode_list(value)
        elif isinstance(value, dict):
            value = _decode_dict(value)
        rv[key] = value
    return rv

Suggestions that improve any of the following would be very helpful:

  • Performance
  • Coverage of more edge cases
  • Error handling

1 Answer

  1. Editorial Team
    Added an answer on June 4, 2026 at 3:35 pm (2026-06-04T15:35:10+00:00)

    I agree with the comments that say that this may be misguided. That said, here are some holes in your current scheme:

    1. d.setdefault can be used to add unicode objects to your dict, because dict.setdefault does not call your overridden __setitem__:

      >>> d = UTF8Dict()
      >>> d.setdefault(u'x', u'y')
      
    2. d.update can be used to add unicode objects to your dict, because dict.update does not call your overridden __setitem__ either:

      >>> d = UTF8Dict()
      >>> d.update({u'x': u'y'})
      
    3. The list values contained in a dict can be modified to include unicode objects using any standard list operation, e.g.:

      >>> d = UTF8Dict(x=[])
      >>> d['x'].append(u'x')
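
The first two holes can be narrowed by routing setdefault and update through __setitem__. What follows is only a minimal sketch, not a complete fix: the helper _to_utf8 and the text_type alias are hypothetical names introduced here (text_type is unicode on Python 2 and str on Python 3), and the sketch covers top-level keys and values only. Lists stored in the dict can still be mutated afterwards, so the third hole stays open, and other write paths may exist.

```python
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3: str is the text type


def _to_utf8(obj):
    """Encode text to UTF-8 bytes; return any other object unchanged."""
    if isinstance(obj, text_type):
        return obj.encode('utf-8')
    return obj


class UTF8Dict(dict):
    """Hypothetical variant: every top-level write goes through __setitem__."""

    def __init__(self, *args, **kwargs):
        super(UTF8Dict, self).__init__()
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        super(UTF8Dict, self).__setitem__(_to_utf8(key), _to_utf8(value))

    def setdefault(self, key, default=None):
        # Encode the key before the lookup so text and bytes keys agree.
        key = _to_utf8(key)
        if key not in self:
            self[key] = default
        return self[key]

    def update(self, *args, **kwargs):
        # Build a plain dict first so mappings, iterables of pairs and
        # keyword arguments are all accepted, then store item by item.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


d = UTF8Dict()
d.setdefault(u'x', u'y')
d.update({u'a': u'b'})
# No key or value in d is a text object at this point.
assert not any(isinstance(obj, text_type) for item in d.items() for obj in item)
```

Storing item by item is slower than the built-in dict.update, so this trades some performance for coverage.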
      

    Why do you want to ensure that your data structure contains only utf-8 strings?
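
If the underlying goal is to hand byte strings to some consumer (an assumption, since the question does not say), one alternative is to keep ordinary dicts and lists internally and convert once at the boundary. A rough sketch follows; encode_utf8 and text_type are hypothetical names, and the function returns a converted copy rather than changing the input:

```python
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3: str is the text type


def encode_utf8(data):
    """Return a copy of data with all text encoded as UTF-8 byte strings."""
    if isinstance(data, text_type):
        return data.encode('utf-8')
    if isinstance(data, dict):
        # Always returns a plain dict, even for dict subclasses.
        return dict((encode_utf8(k), encode_utf8(v)) for k, v in data.items())
    if isinstance(data, (list, tuple, set, frozenset)):
        # Rebuild the same container type from the converted items.
        # This would not work for namedtuple, which takes positional fields.
        return type(data)(encode_utf8(item) for item in data)
    return data


payload = {u'x': [u'y', (u'z', 1)]}
encoded = encode_utf8(payload)
```

Because the conversion happens at a single point, later mutations such as d['x'].append(u'x') no longer matter, as long as the structure is converted after the last change.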

© 2021 The Archive Base. All Rights Reserved