The Archive Base


Editorial Team
Asked: May 15, 2026

I have a HTML string in ISO-8859-1 encoding. I need to pass this string


I have an HTML string in ISO-8859-1 encoding. I need to pass this string to HTML::Entities::decode_entities() to convert some of the HTML character codes to their respective characters. To do so I am using HTML::Entities (from HTML::Parser 3.65), but after the decode_entities() call my whole string becomes a UTF-8 string. This behavior seems consistent with the HTML::Parser documentation. Since I need the string back in ISO-8859-1 for further processing, I used Encode::encode("iso-8859-1", $str) to convert it back.
The results are fine except for some characters, which come out as question marks. One example is the curly single quote ’ (given as an HTML character code in the input).

Can anybody tell me whether this is a limitation of the Encode module? Any other pointers that help solve the problem are also welcome.
Here is sample text containing the characters that cause the issue:

my $str = "This is a test string to test the encoding of some chars like ’ “ ” etc these are failing to encode; some of them which encode correctly are é « etc.";

Thanks


1 Answer

  1. Editorial Team
    Added an answer on May 15, 2026 at 7:31 am

    The fundamental problem is that the characters ’, “, and ” (U+2019, U+201C, and U+201D) do not exist in ISO-8859-1, so Encode::encode() substitutes a question mark for each of them. You’ll have to decide what you want to do with them.
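
    To make the substitution visible, here is a minimal sketch. The sample string and variable names are illustrative assumptions, not taken from the question. It encodes one character that ISO-8859-1 can represent (é) and one that it cannot (’), then dumps the resulting bytes in hex.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode);

    # U+00E9 (e-acute) has a slot in ISO-8859-1; U+2019 (curly quote) does not.
    my $text  = "caf\x{E9} \x{2019}";
    my $bytes = encode('iso-8859-1', $text);

    # Dump every resulting byte as two hex digits.
    my $hex = join ' ', map { sprintf '%02X', ord } split //, $bytes;
    print "$hex\n";    # prints "63 61 66 E9 20 3F"; 3F is "?"
    ```

    The final byte, 3F, is the question mark that Encode substitutes by default for a character the target encoding cannot hold.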

    Some possibilities:

    Use cp1252, Microsoft’s “extended” version of ISO-8859-1, instead of the real thing. It does include those characters.
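
    As a rough illustration of the cp1252 option, the sketch below encodes the problem characters from the question plus two that already worked. The variable names are assumptions for the example. It only helps if whatever consumes the bytes also treats them as Windows-1252, because in strict ISO-8859-1 the bytes 0x80 to 0x9F are control characters.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode);

    # Curly single quote, curly double quotes, e-acute, left guillemet.
    my $text  = "\x{2019} \x{201C} \x{201D} \x{E9} \x{AB}";
    my $bytes = encode('cp1252', $text);

    # Dump every resulting byte as two hex digits.
    my $hex = join ' ', map { sprintf '%02X', ord } split //, $bytes;
    print "$hex\n";    # prints "92 20 93 20 94 20 E9 20 AB"
    ```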

    Re-encode the characters outside the ISO-8859-1 range (plus &) as entities before converting from UTF-8 to ISO-8859-1:

    my $toEncode = do { no warnings 'utf8'; "&\x{0100}-\x{10FFFF}" };
    $string = HTML::Entities::encode_entities($string, $toEncode);
    

    (The no warnings bit is needed because U+10FFFF is a Unicode noncharacter, which some versions of Perl warn about.)
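
    Put together, the whole round trip might look like the sketch below. The input string and the variable names are hypothetical; the calls are the documented HTML::Entities and Encode functions. Characters that fit in ISO-8859-1 are stored as single bytes, while everything else (and any literal &) goes back to being an entity.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode);
    use HTML::Entities qw(decode_entities encode_entities);

    # Hypothetical input mixing entities that do and do not fit in ISO-8859-1.
    my $html = 'It&#8217;s &eacute; &amp; &laquo;';

    # In non-void context decode_entities returns a decoded copy.
    my $decoded = decode_entities($html);

    # Re-encode "&" and everything above U+00FF as entities.
    my $unsafe = do { no warnings 'utf8'; "&\x{0100}-\x{10FFFF}" };
    my $safe   = encode_entities($decoded, $unsafe);

    # Every remaining character now fits, so nothing becomes "?".
    my $bytes = encode('iso-8859-1', $safe);
    ```

    Decoding the entities in the result again gives back the same characters as the first decode, so no information is lost along the way.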

    There are other possibilities. It really depends on what you’re trying to accomplish.
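
    One such possibility, which the answer itself does not name and which is offered here only as a sketch, is to let Encode handle the fallback. Passing Encode::FB_HTMLCREF as the third argument replaces each unmappable character with a decimal character reference. Unlike the encode_entities approach, it does not escape a literal &, so it fits best when the text is already valid HTML.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode);

    # Illustrative string: U+2019 cannot be mapped, U+00E9 can.
    my $text = "It\x{2019}s caf\x{E9}";

    # Unmappable characters become &#NNNN; instead of "?".
    my $bytes = encode('iso-8859-1', $text, Encode::FB_HTMLCREF);

    # Replace the one non-ASCII byte so the result prints cleanly anywhere.
    (my $shown = $bytes) =~ s/\xE9/<E9>/;
    print "$shown\n";    # prints "It&#8217;s caf<E9>"
    ```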
