The Archive Base

Editorial Team
Asked: May 25, 2026 (2026-05-25T03:11:08+00:00)

During web scraping, I got character \u260e in unicode. My output is "The Last

During web scraping, I got character \u260e in unicode. My output is "The Last Resort, â˜Ž +977 1 4700525". So instead of â˜Ž, there should be ☎.

How can I get it back to telephone sign (☎)? So output will be "The Last Resort, ☎ +977 1 4700525".

Krish

1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 3:11 am (2026-05-25T03:11:09+00:00)

    When you scraped a site, Python recognized a “☎” character and stored it in a string.

    This character has code point U+260E. When characters are stored, however, they are stored as sequences of one or more bytes. What those bytes are depends on the encoding being used. In your case UTF-8 was probably used.

    The UTF-8 encoding of this character is E2 98 8E (See http://www.fileformat.info/info/unicode/char/260e/index.htm).
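As a quick check of the byte values above, here is a minimal Python 3 sketch that looks up the character's name and code point and prints its UTF-8 bytes. The variable names are illustrative only.

```python
import unicodedata

phone = "\u260e"  # the telephone sign from the question

# Unicode name and code point of the character.
name = unicodedata.name(phone)
code_point = f"U+{ord(phone):04X}"

# UTF-8 stores this single character as three bytes.
utf8_bytes = phone.encode("utf-8")
utf8_hex = " ".join(f"{b:02X}" for b in utf8_bytes)

print(name, code_point, utf8_hex)
# prints: BLACK TELEPHONE U+260E E2 98 8E
```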

    So now you have a byte sequence representing your character. What are you going to do with it? You are going to output it somewhere. But you want to convert this byte string into characters, so you have to specify an encoding. Let’s say you specify the encoding Windows-1252 (see http://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT).

    • E2 is â
    • 98 is ˜
    • 8E is Ž

    which is what you see. You need to write out your Python string as UTF-8. Or, if you are writing HTML, follow DruvPathak’s suggestion and use HTML character entity references, in this case

    &#x260E;
    

    or

    &#9742;
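
A numeric character reference can be written in hexadecimal or in decimal; either way it names the same code point, U+260E. The following is a minimal Python 3 sketch, with a hypothetical helper name `char_refs`, of how both forms can be derived, using the standard library's `html.unescape` to confirm that they decode back to the telephone sign.

```python
import html


def char_refs(ch):
    """Return the (hexadecimal, decimal) numeric character references for one character."""
    cp = ord(ch)
    return f"&#x{cp:X};", f"&#{cp};"


hex_ref, dec_ref = char_refs("\u260e")
print(hex_ref, dec_ref)
# prints: &#x260E; &#9742;

# Both references decode back to the same character.
same = html.unescape(hex_ref) == html.unescape(dec_ref) == "\u260e"
print(same)
# prints: True
```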
    

    I suspect what happened is that you did not specify an encoding when you wrote out your string and that Windows-1252 was the default. Or, maybe your browser was set to display Windows-1252 by default.
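
The suspected mix-up can be reproduced in a few lines of Python 3. This is only a sketch of the likely cause, assuming the text was encoded as UTF-8 and then decoded as Windows-1252 (`cp1252` in Python); it also shows that the mistake is reversible when no bytes were lost along the way.

```python
original = "The Last Resort, \u260e +977 1 4700525"

# Correct bytes, wrong decoder: this reproduces the garbled output.
# garbled is now "The Last Resort, â˜Ž +977 1 4700525"
garbled = original.encode("utf-8").decode("cp1252")

# Undo the mistake: turn the text back into the bytes it came from,
# then decode those bytes with the encoding that was actually used.
repaired = garbled.encode("cp1252").decode("utf-8")

print(repaired == original)
# prints: True
```

Repairing after the fact is a workaround. Decoding the scraped bytes as UTF-8 in the first place avoids the problem.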

    An interesting thing about sending data out in HTML is that you can send a UTF-8 byte stream, declare UTF-8 in the HTTP Content-Type header, and put meta tags in your HTML document stating that the page is encoded in UTF-8, yet if an end user's browser lets them override the encoding sent by the server, there is still a chance that they will see the data wrongly.
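
A minimal sketch of the "UTF-8 everywhere" setup for a static page, in Python 3: the file is written with an explicit encoding and the document declares the same encoding in a meta tag. The file name `contact.html` and the page markup are made up for illustration, and the HTTP header is not covered because it depends on the server.

```python
import os
import tempfile

text = "The Last Resort, \u260e +977 1 4700525"
page = (
    "<!DOCTYPE html>\n"
    '<html><head><meta charset="utf-8"><title>Contact</title></head>\n'
    f"<body><p>{text}</p></body></html>\n"
)

# Hypothetical output location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "contact.html")

# Passing encoding explicitly avoids falling back to a platform default
# such as Windows-1252.
with open(path, "w", encoding="utf-8") as f:
    f.write(page)

# Reading the raw bytes back confirms the telephone sign is stored as E2 98 8E.
with open(path, "rb") as f:
    raw = f.read()
print(b"\xe2\x98\x8e" in raw)
# prints: True
```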

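    If you use character entity references, the browser will show the character properly regardless of that setting, because a numeric reference names the code point directly and does not depend on how the page's bytes are decoded.

    It may be inconvenient, though, to use these entity references everywhere. Most people these days don’t manually set their browser to override the encoding sent by the server.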

    ADDENDUM

    So let’s say you have a unicode string and you want to produce a regular (non-unicode) string (of type str) containing HTML character entity references. Here is a full example script that illustrates a direct, though not necessarily the most Pythonic, way to do it:

    def to_character_entity_reference_string(s):
        """Return s with every character replaced by a decimal character reference."""
        return "".join("&#" + str(ord(c)) + ";" for c in s)
    
    print(to_character_entity_reference_string(u'काठमाण्डु'))
    

    If you run this script, you get the output

    &#2325;&#2366;&#2336;&#2350;&#2366;&#2339;&#2381;&#2337;&#2369;
    

    You can put that output into a file, open it in a web browser, and you will see काठमाण्डु displayed as expected.

    You can create variations on this base script so that characters with code points less than 128 are preserved while everything else becomes a character entity reference. You might also want to explore Python’s encode and decode methods. Once again, character entity references guard against people manually changing their browser settings to override your encoding, which is perfectly fine but may be considered overkill. End users who mess with these settings can be said to get what they deserve, so it is generally accepted to just encode everything in UTF-8, period. Nevertheless, it is nice to know about character entity references.
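
One way to get the variation that preserves characters below 128 is Python's built-in `xmlcharrefreplace` error handler, which replaces anything the target encoding cannot represent with a decimal character reference. The sketch below assumes Python 3; the function name `to_ascii_html` is made up for illustration.

```python
def to_ascii_html(s):
    """Keep ASCII characters as-is; replace the rest with numeric character references."""
    return s.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_html("The Last Resort, \u260e +977 1 4700525"))
# prints: The Last Resort, &#9742; +977 1 4700525
```

Note that this does not escape `&`, `<`, or `>`; that is a separate step (for example with `html.escape`) if the text is not already valid HTML.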

© 2021 The Archive Base. All Rights Reserved