Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 1059775
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 16, 20262026-05-16T18:13:30+00:00 2026-05-16T18:13:30+00:00

I’m facing a nasty encoding issue when transforming XML via XSLT through PHP. The

  • 0

I’m facing a nasty encoding issue when transforming XML via XSLT through PHP.

The problem can be summarised/dumbed down as follows: when I copy a (UTF-8 encoded) XHTML file with an XSLT stylesheet, some characters are displayed wrong. When I just show the same XHTML file, all characters come out correctly.

Following files illustrate the problem:

XHTML

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
        <title>encoding test</title>
    </head>
    <body>
        <p>This is how we d&#239;&#223;&#960;&#955;&#509; &#145;special characters&#146;</p>
    </body>
</html>

XSLT

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

    <xsl:output method="xml" encoding="UTF-8"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

PHP

<?php
  $xml_file = 'encoding_test.xml';
  $xsl_file = 'encoding_test.xsl';

  $xml_doc = new DOMDocument('1.0', 'utf-8');
  $xml_doc->load($xml_file);

  $xsl_doc = new DOMDocument('1.0', 'utf-8');
  $xsl_doc->load($xsl_file);

  $xp = new XsltProcessor();
  $xp->importStylesheet($xsl_doc);

  // alllow to bypass XSLT transformation with bypass=true request parameter
  if ($bypass = $_GET['bypass']) {
    echo file_get_contents($xml_file);
  }
  else {
    echo $xp->transformToXML($xml_doc);
  }
?>

When this script is invoked as such (via e.g. http://localhost/encoding_test/encoding_test.php), all characters in the transformed XHTML document come out ok, except for the &#145; and &#146; character entities (they’re opening and closing single quotation marks). I’m not a Unicode expert, but two things strike me:

  1. all other character entities are interpreted correctly (which could imply something about the UTF-8-ness of &#145; and &#146;)
  2. yet, when the XHTML file is displayed unmediated (via e.g. http://localhost/encoding_test/encoding_test.php?bypass=true), all characters are displayed properly.

I think I’ve declared UTF-8 encoding for the output anywhere I could. Do others perhaps see what’s wrong and can be righted?

Thanks in advance!

Ron Van den Branden

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-16T18:13:31+00:00Added an answer on May 16, 2026 at 6:13 pm

    &#145; and &#146; are no visible Unicode characters.

    They are old HTML character references1 for single quotes, but when you process them using an XSLT processor the processor doesn’t see single quotes but the Unicode characters with decimal codes 145 and 146, i.e. U+0090 and U+0091.

    These characters are private use (i.e. the usage is not defined by the Unicode consortium) C1 control codes.

    The solution is to use the correct Unicode characters &#x2018; and &#x2019;.

    1In fact, these are codes that map to Windows-1252 encoding. They are displayed by browsers but they are actually not valid in HTML:

    NOTE — the above SGML declaration, like that of HTML 2.0,
    specifies the character numbers 128 to 159 (80 to 9F hex)
    as UNUSED. This means that numeric character references
    within that range (e.g. ’) are illegal in HTML. Neither ISO 8859-1 nor ISO 10646 contain characters in that
    range, which is reserved for control characters.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a string like this: La Torre Eiffel paragonata all&#8217;Everest What PHP function
We are using XSLT to translate a RIXML file to XML. Our RIXML contains
link Im having trouble converting the html entites into html characters, (&# 8217;) i
I want to count how many characters a certain string has in PHP, but
I would like to count the length of a string with PHP. The string
I have a jquery bug and I've been looking for hours now, I can't
this is what i have right now Drawing an RSS feed into the php,
I'm using v2.0 of ClassTextile.php, with the following call: $testimonial_text = $textile->TextileRestricted($_POST['testimonial']); ... and
In my XML file chapters tag has more chapter tag.i need to display chapters
I'm parsing an RSS feed that has an &#8217; in it. SimpleXML turns this

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.