The Archive Base

Editorial Team
Asked: May 23, 2026 (2026-05-23T22:33:01+00:00)

I am parsing some XML files from a third party provider and unfortunately it’s

I am parsing some XML files from a third party provider and unfortunately it’s not always well-formed XML as sometimes some elements contain duplicate attributes.

I don’t have control over the source and I don’t know which elements may have duplicate attributes nor do I know the duplicate attribute names in advance.

Obviously, loading the content into an XmlDocument object raises an XmlException on the duplicate attributes, so I thought I could use an XmlReader to step through the XML element by element and deal with the duplicate attributes when I get to the offending element.

However, the XmlException is raised on reader.Read(), before I get a chance to inspect the element’s attributes.

Here’s a sample method to demonstrate the issue:

public static void ParseTest()
{
    const string xmlString = 
        @"<?xml version='1.0'?>
        <!-- This is a sample XML document -->
        <Items dupattr=""10"" id=""20"" dupattr=""33"">
            <Item>test with a child element <more/> stuff</Item>
        </Items>";

    var output = new StringBuilder();
    using (XmlReader reader = XmlReader.Create(new StringReader(xmlString)))
    {
        XmlWriterSettings ws = new XmlWriterSettings();
        ws.Indent = true;
        using (XmlWriter writer = XmlWriter.Create(output, ws))
        {
            while (reader.Read())   /* Exception thrown here when the Items element is encountered */
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        writer.WriteStartElement(reader.Name);
                        if (reader.HasAttributes){ /* CopyNonDuplicateAttributes(); */}
                        break;
                    case XmlNodeType.Text:
                        writer.WriteString(reader.Value);
                        break;
                    case XmlNodeType.XmlDeclaration:
                    case XmlNodeType.ProcessingInstruction:
                        writer.WriteProcessingInstruction(reader.Name, reader.Value);
                        break;
                    case XmlNodeType.Comment:
                        writer.WriteComment(reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        writer.WriteFullEndElement();
                        break;
                }
            }

        }
    }
    string str = output.ToString();
}

Is there another way to parse the input and remove the duplicate attributes without having to use regular expressions and string manipulation?

1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 10:33 pm

    I found a solution by thinking of the XML as an HTML document. Then using the open-source Html Agility Pack library, I was able to get valid XML.

    The trick was to save the XML with an HTML header first.
    So replace the XML declaration
    <?xml version="1.0" encoding="utf-8" ?>
    with an HTML declaration like this:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
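
The header swap itself could look roughly like the following. This is only a sketch and not part of the original solution: the class name `HeaderSwap`, both method names, and the choice of a temp file are assumptions made for illustration, and it assumes the XML declaration, if there is one, sits at the very start of the content. Only the declaration is touched; the element markup is left for Html Agility Pack to clean up.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper: swaps the XML declaration for an HTML DOCTYPE.
public static class HeaderSwap
{
    // The XHTML 1.0 Transitional DOCTYPE used as the HTML header.
    private const string HtmlDoctype =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" " +
        "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">";

    // Matches a leading XML declaration, e.g. <?xml version="1.0" encoding="utf-8" ?>
    private static readonly Regex XmlDeclaration =
        new Regex(@"^\s*<\?xml\b[^>]*\?>", RegexOptions.IgnoreCase);

    // Replaces the declaration with the DOCTYPE, or prepends the DOCTYPE
    // when the content has no declaration at all.
    public static string ReplaceDeclaration(string xml)
    {
        if (XmlDeclaration.IsMatch(xml))
        {
            return XmlDeclaration.Replace(xml, HtmlDoctype, 1);
        }
        return HtmlDoctype + Environment.NewLine + xml;
    }

    // Writes the rewritten content to a temp file and returns the file path.
    public static string SaveWithHtmlHeader(string xml)
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, ReplaceDeclaration(xml));
        return path;
    }
}
```

The path returned by `SaveWithHtmlHeader` is what would then be handed to the loading method as its `url` argument.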

    Once the contents are saved to file, this method will return a valid XML Document.

    // Requires a reference to HtmlAgilityPack
    // using System.IO; using System.Xml; using HtmlAgilityPack;
    public XmlDocument LoadHtmlAsXml(string url)
    {
        var web = new HtmlWeb();

        var m = new MemoryStream();
        var xtw = new XmlTextWriter(m, null);

        // Load the content into the writer
        web.LoadHtmlAsXml(url, xtw);

        // Flush any buffered output so the stream holds the full document
        // (harmless if the writer has already been flushed)
        xtw.Flush();

        // Rewind the memory stream
        m.Position = 0;

        // Create, fill, and return the XML document
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml((new StreamReader(m)).ReadToEnd());
        return xmlDoc;
    }
    

    The duplicate attribute nodes are automatically removed with the later attribute values overwriting the earlier ones.
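
To decide when this fallback is needed, and to confirm afterwards that the duplicates are gone, a small well-formedness check may help. The sketch below uses only System.Xml; the names `XmlCheck` and `IsWellFormed` are invented for illustration and are not part of Html Agility Pack or the original solution.

```csharp
using System;
using System.Xml;

// Hypothetical helper: reports whether a string parses as well-formed XML.
public static class XmlCheck
{
    // Returns true if the text loads into an XmlDocument without an XmlException.
    // Duplicate attributes on an element make LoadXml throw, so they yield false.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

For the `Items` element from the question, `XmlCheck.IsWellFormed` returns false because `dupattr` appears twice, so only input that fails this check needs to go through the HTML round trip.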

© 2021 The Archive Base. All Rights Reserved