
The Archive Base

Editorial Team
Asked: May 15, 2026

I have a PHP file which produces an Xml sitemap based on data which

I have a PHP file which produces an XML sitemap based on data imported from a number of sources. The sitemap is currently not well formed because of an illegal character in one line of the imported data, and I am struggling to remove it.

The character appears to be the ‘squared’ sign (superscript 2) and is displayed as a square. I have tried pasting it into a hex editor, but it is shown as a ?, and the hex code also corresponds to ?. I have also tried using iconv to convert from every source encoding to every destination encoding, but no combination removes the character.

I also have the following function to remove non-ASCII characters:

function stripInvalidXml($value)
{
    $ret = "";
    $current = 0;
    if (empty($value)) 
    {
        return $ret;
    }

    $length = strlen($value);
    for ($i=0; $i < $length; $i++)
    {
        // Square brackets replace the curly-brace string offset syntax, which was removed in PHP 8.
        $current = ord($value[$i]);
        if (($current == 0x9) ||
            ($current == 0xA) ||
            ($current == 0xD) ||
            (($current >= 0x20) && ($current <= 0xD7FF)) ||
            (($current >= 0xE000) && ($current <= 0xFFFD)) ||
            (($current >= 0x10000) && ($current <= 0x10FFFF)))
        {
            if($current != 0x1F)
            {
                $ret .= chr($current);
            }
        }
        else
        {
            $ret .= " ";
        }
    }


    return $ret;
}

However, this still does not remove it. If I step through the code, the illegal character is expanded to ￿ in Eclipse's debug window. The string it is having issues with is below (hoping it pastes correctly):

251gm-50

Any ideas for a function which will remove this character and prevent this from occurring are much appreciated. I have little control over the data that is imported, so it needs to be done at the point of XML generation.

EDIT

After posting, I can see that the character doesn’t appear correctly. When viewed in Eclipse's window it appears as & # 65535 ; (without spaces; if I leave the spaces out, it renders as the character, which looks like ￿).
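One reason the function above cannot catch characters in the higher ranges: ord() returns the value of a single byte (0 to 255), so the comparisons against 0xD7FF, 0xE000 and so on never see a full code point. The following is a minimal sketch of a code-point-aware variant, assuming the input is valid UTF-8. The name stripInvalidXmlChars is hypothetical and not from the original post. It only handles raw characters; a literal numeric reference such as the one described in the EDIT is plain ASCII text and passes through unchanged.

```php
<?php
// Hypothetical helper (not from the original post).
// Replaces every character that is outside the XML 1.0 "Char" ranges with a
// space, matching on Unicode code points instead of individual bytes.
// Assumption: $value is valid UTF-8.
function stripInvalidXmlChars(string $value): string
{
    $result = preg_replace(
        // The /u modifier makes PCRE treat the subject as UTF-8 code points.
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        ' ',
        $value
    );

    // preg_replace() returns null on failure, e.g. when the input is not valid UTF-8.
    if ($result === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    return $result;
}
```

A call such as stripInvalidXmlChars($description) would go just before the value is written into the sitemap, where $description stands for whatever variable holds the imported text.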

1 Answer

  1. Editorial Team
    Added an answer on May 15, 2026 at 7:05 pm

    I think I was looking down the wrong path: rather than an encoding issue, the character was an HTML entity representing the ‘squared’ symbol. As the descriptions in the URL exist only for search engine purposes, I can safely remove all HTML entities with the following regex:

    $content = preg_replace("/&#?[a-z0-9]+;/i","",$content);
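The regex above removes every entity, including ones such as &amp;amp; that are legal in XML. Where that is too broad, a narrower approach is to drop only numeric character references whose code point falls outside the XML 1.0 character ranges (65535, i.e. U+FFFF, is one of these). The sketch below is one way to do that. The function names isLegalXmlCodePoint and stripIllegalNumericEntities are hypothetical, and named entities are deliberately left untouched.

```php
<?php
// Hypothetical helpers (not from the original answer).

// True when $cp is within the character ranges allowed by XML 1.0.
function isLegalXmlCodePoint(int $cp): bool
{
    return $cp === 0x9 || $cp === 0xA || $cp === 0xD
        || ($cp >= 0x20 && $cp <= 0xD7FF)
        || ($cp >= 0xE000 && $cp <= 0xFFFD)
        || ($cp >= 0x10000 && $cp <= 0x10FFFF);
}

// Removes decimal (&#NNN;) and hexadecimal (&#xHHH;) character references
// that point at a code point XML 1.0 does not allow. Everything else,
// including named entities such as &amp;, is returned unchanged.
function stripIllegalNumericEntities(string $content): string
{
    $result = preg_replace_callback(
        '/&#(?:x([0-9a-f]+)|([0-9]+));/i',
        function (array $m): string {
            // Group 1 is filled for hex references, group 2 for decimal ones.
            // intval() saturates on overflow, so huge values count as illegal.
            $cp = ($m[1] !== '') ? intval($m[1], 16) : intval($m[2], 10);
            return isLegalXmlCodePoint($cp) ? $m[0] : '';
        },
        $content
    );

    // Fall back to the untouched input if PCRE reports an error.
    return $result ?? $content;
}
```

Whether to strip all entities or only the illegal numeric ones depends on how much of the description text needs to survive; for text that exists purely for search engines, the broader regex may be enough.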
    