The Archive Base

Editorial Team
Asked: May 13, 2026

According to this answer: urllib2 read to Unicode

I have to read the Content-Type header to find the encoding before I can convert the page to Unicode. However, some websites don't send a "charset" parameter.

For example, the Content-Type header for this page is just "text/html", so I can't convert it to Unicode.

encoding = urlResponse.headers['content-type'].split('charset=')[-1]
htmlSource = unicode(htmlSource, encoding)

TypeError: 'int' object is not callable

Is there a default "encoding" (English, of course)…so that if nothing is found, I can just use that?

1 Answer

  1. Editorial Team
    Added an answer on May 13, 2026 at 12:15 am

    Is there a default “encoding” (English, of course)…so that if nothing is found, I can just use that?

    No, there isn’t. You must guess.

    A simple approach: try to decode the bytes as UTF-8. If that works, great, it's probably UTF-8. If it doesn't, fall back to the most likely encoding for the kind of pages you're fetching. For English pages that's cp1252, the Windows Western European encoding. (cp1252 is close to ISO-8859-1; in fact most browsers decode with cp1252 even when a page declares its charset as ISO-8859-1, so it's worth copying that behaviour.)
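As a rough sketch of that fallback chain (written for Python 3, not the Python 2 of the question): use the charset from the header if one is declared, then try UTF-8, then fall back to cp1252. The function names `charset_from_content_type` and `decode_body` are made up for illustration, and treating a declared ISO-8859-1 as cp1252 is an assumption that mirrors the browser behaviour described above.

```python
from email.message import Message


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # Returns the lower-cased charset, or None for a bare "text/html".
    return msg.get_content_charset()


def decode_body(raw, content_type="text/html", fallback="cp1252"):
    """Decode raw bytes: declared charset, then UTF-8, then the fallback."""
    declared = charset_from_content_type(content_type)
    if declared in {"iso-8859-1", "latin1"}:
        # Assumption: copy browsers and read ISO-8859-1 as cp1252.
        declared = "cp1252"
    candidates = [declared] if declared else []
    candidates.append("utf-8")
    for name in candidates:
        try:
            return raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown charset name: try the next candidate.
            continue
    # cp1252 leaves a few byte values undefined, so replace rather than fail.
    return raw.decode(fallback, errors="replace")


print(decode_body(b"caf\xe9"))  # not valid UTF-8, so cp1252 is used
```

Note that splitting the header on 'charset=' returns the whole value ("text/html") when no charset is present, which is why the sketch parses the header and gets None instead.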

    If you need to guess the encoding for pages in other languages, it gets very hairy. There are existing modules that can help you guess in these situations; see, e.g., chardet.
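A minimal sketch of how chardet might be wired in, assuming Python 3 and that the third-party chardet package may or may not be installed. `chardet.detect` is the library's detection call; the wrapper names `guess_encoding` and `decode_guessed` are hypothetical, and the cp1252 default is just the fallback suggested above.

```python
try:
    import chardet  # third-party; optional in this sketch
except ImportError:
    chardet = None


def guess_encoding(raw, default="cp1252"):
    """Guess an encoding name for raw bytes, falling back to a default."""
    if chardet is not None:
        # detect() returns a dict; "encoding" is None when it cannot tell.
        guess = chardet.detect(raw).get("encoding")
        if guess:
            return guess
    return default


def decode_guessed(raw, default="cp1252"):
    """Decode raw bytes with the guessed encoding, replacing bad bytes."""
    name = guess_encoding(raw, default)
    try:
        return raw.decode(name, errors="replace")
    except LookupError:
        # The guessed name is not a codec Python knows; use the default.
        return raw.decode(default, errors="replace")


print(decode_guessed(b"plain ascii text"))
```

The result is still only a guess: short or mostly-ASCII input gives the detector very little to work with, so a declared charset, when there is one, is usually the better source.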

© 2021 The Archive Base. All Rights Reserved