The Archive Base

Editorial Team
Asked: June 8, 2026 (2026-06-08T13:57:05+00:00)


I often receive XML files that contain unescaped special characters such as &, <, >, " and '. Because of that, I cannot load them with SimpleXML or DOM, and so I cannot validate users' XML files against my XSD (below) before further processing in PHP.

Is there any way of solving this problem?

I'm reading the XML file from a remote host, so its size can be anywhere between 10 KB and 10 MB.

Thanks in advance

Note: I'm including only the invalid XML elements below because, for some reason, the whole XML file appears as plain text here.

XML

<url>http://www.amazon.co.uk/gp/product/B005MG8O96/ref=olp_product_details?ie=UTF8&me=&seller=</url>
<description>iPhone 4. The "fastest", <b>highest-resolution</b> iPhone.</description>

XSD

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:element name="store">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="item" minOccurs="1" maxOccurs="unbounded">
                <xs:complexType>
                    <xs:sequence>
                        <xs:element name="title" type="title_type" />
                        <xs:element name="description" type="description_type" />
                        <xs:element name="price" type="xs:decimal" />
                        <xs:element name="url" type="url_type" />
                        <xs:element name="images">
                            <xs:complexType>
                                <xs:sequence>
                                    <xs:element name="image" minOccurs="1" maxOccurs="unbounded">
                                        <xs:complexType>
                                            <xs:attribute name="url" type="url_type" />
                                        </xs:complexType>
                                    </xs:element>
                                </xs:sequence>
                            </xs:complexType>
                        </xs:element>
                    </xs:sequence>
                    <xs:attribute name="id" type="id_type" />
                    <xs:attribute name="available" type="available_type" />
                </xs:complexType>
            </xs:element>
        </xs:sequence>
        <xs:attribute name="id" type="id_type" />
        <xs:attribute name="date" type="xs:date" />
        <xs:attribute name="time" type="xs:time" />
    </xs:complexType>
</xs:element>

<xs:simpleType name="title_type">
    <xs:restriction base="xs:string">
        <xs:minLength value="1" />
        <xs:maxLength value="100" />
    </xs:restriction>
</xs:simpleType>

<xs:simpleType name="description_type">
    <xs:restriction base="xs:string">
        <xs:minLength value="1" />
        <xs:maxLength value="255" />
    </xs:restriction>
</xs:simpleType>

<xs:simpleType name="url_type">
    <xs:restriction base="xs:anyURI">
        <xs:minLength value="10" />
        <xs:maxLength value="2000" />
    </xs:restriction>
</xs:simpleType>

<xs:simpleType name="id_type">
    <xs:restriction base="xs:string">
        <xs:minLength value="1" />
        <xs:maxLength value="100" />
    </xs:restriction>
</xs:simpleType>

<xs:simpleType name="available_type">
    <xs:restriction base="xs:string">
        <xs:enumeration value="Yes" />
        <xs:enumeration value="No" />
    </xs:restriction>
</xs:simpleType>

</xs:schema>

1 Answer

  1. Editorial Team
    Added an answer on June 8, 2026 at 1:57 pm (2026-06-08T13:57:08+00:00)

    You should get them to send you proper XML as the commenters said. If you are unable to, you can do the following:

    For each element that might contain invalid characters, if its type is xs:string and the element name is unique in your schema, do a multiline search for its open and close tags. Between those tags, replace & with &amp; first (so that the entities added next are not escaped a second time), then replace < with &lt; and > with &gt;. Single and double quotes are not metacharacters outside tags, so once you have made those replacements you should have valid XML. It might not be the XML the sender intended, but it is the only unambiguous way I can think of to turn their non-XML into valid XML.
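The search-and-replace described above could be sketched roughly as follows. This is a minimal illustration in Python rather than PHP, and the same idea should carry over to PHP's `preg_replace_callback`. The function names `escape_text` and `repair_elements` and the default list of element names are assumptions made for the example, not part of the original question. It also assumes the listed elements never nest inside themselves and that their text contains no entities that are already escaped.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical list: names of xs:string elements that are unique in the schema.
STRING_ELEMENTS = ("title", "description")


def escape_text(text):
    # '&' is replaced first so the entities added afterwards are not re-escaped.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def repair_elements(raw_xml, names=STRING_ELEMENTS):
    for name in names:
        # re.DOTALL lets '.' cross line breaks (the "multiline search").
        # The non-greedy '.*?' stops at the first matching close tag, and
        # '(?<!/)' skips self-closing tags such as <title />.
        pattern = re.compile(
            r"(<{0}(?:\s[^<>]*)?(?<!/)>)(.*?)(</{0}\s*>)".format(re.escape(name)),
            re.DOTALL,
        )
        raw_xml = pattern.sub(
            lambda m: m.group(1) + escape_text(m.group(2)) + m.group(3),
            raw_xml,
        )
    return raw_xml


sample = (
    "<item><title>Tom & Jerry</title>"
    '<description>iPhone 4. The "fastest", <b>highest-resolution</b> iPhone.'
    "</description></item>"
)
repaired = repair_elements(sample)
root = ET.fromstring(repaired)  # the raw sample would raise ParseError here
print(root.findtext("description"))
```

Note that any text that was already correctly escaped (for example an existing `&amp;`) would be escaped a second time by this sketch, so it only suits senders who never escape anything.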

    An alternative to the replacements I mentioned would be to always wrap the text content of those string elements in a CDATA section. But really, how hard is it to just require whoever generates these files to do that for you?
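The CDATA alternative might look like the sketch below, again in Python for illustration only. The name `wrap_in_cdata` and the default element names are hypothetical, and the code assumes the incoming text is not already wrapped in a CDATA section. The one sequence a CDATA section cannot hold is `]]>`, so the sketch splits it across two adjacent sections.

```python
import re
import xml.etree.ElementTree as ET


def wrap_in_cdata(raw_xml, names=("title", "description")):
    # 'names' is a hypothetical list of the string elements to protect.
    for name in names:
        pattern = re.compile(
            r"(<{0}(?:\s[^<>]*)?(?<!/)>)(.*?)(</{0}\s*>)".format(re.escape(name)),
            re.DOTALL,
        )

        def wrap(match):
            # "]]>" would close the section early, so end the section after
            # "]]" and reopen a new one before ">".
            body = match.group(2).replace("]]>", "]]]]><![CDATA[>")
            return match.group(1) + "<![CDATA[" + body + "]]>" + match.group(3)

        raw_xml = pattern.sub(wrap, raw_xml)
    return raw_xml


sample = '<description>iPhone 4. The "fastest", <b>highest-resolution</b> iPhone.</description>'
wrapped = wrap_in_cdata(sample)
element = ET.fromstring(wrapped)  # parses; the text keeps the literal <b> markup
print(element.text)
```

Compared with entity replacement, this leaves the sender's text byte-for-byte readable inside the file, at the cost of having to guard against `]]>`.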
