The Archive Base

Editorial Team
Asked: May 17, 2026

In the middle of an XML document I’m transforming, there is a CDATA node

In the middle of an XML document I’m transforming, there is a CDATA node whose content I know is itself XML. I would like to have that “recursively parsed” as XML so that I can transform it too. After searching, I think my question is very similar to “Handling node containing inner escaped XML”.

That was a year ago: may I just clarify the following:

  1. It says this cannot be done by a single XSLT in one go: rather, you need a two-phase approach. I have just bought a shiny new book on XSLT 2.0. Is it still the case that there is no XSLT instruction to “re-parse” a string node as XML?
  2. In my case the XML-string node is just one node in the whole document. Therefore in Phase #1 I would only be transforming a fragment of the input XML document; the rest needs passing through unchanged to Phase #2. I see several solutions for passing input to output unchanged, but often it seems they “mostly work” and skip or mishandle some kinds of input node. Is there a reliable construct for passing the rest of the input to the output without any changes?
  3. That approach relies on me being able to apply two transforms separately. I am limited by an existing application to a single transform: the application’s XML output is fixed and is transformed by one XSLT file. The only thing I can do is put whatever I like into that XSLT file, and/or add further XSLT files; I cannot influence the top-level call, which passes the XML through one XSLT file. Is there anything I could put into an XSLT file which could cause the second XSLT transform to be invoked?
1 Answer

  1. Editorial Team
    Added an answer on May 17, 2026 at 9:52 pm

    See update at end.

    1. is the most important question. It’s possible to do; the question is whether you’d have to write an XML parser manually in XSLT, or use an extension function, or whether there’s a convenient, portable solution. Update: If you can use Saxon’s saxon:parse() extension function, that’s by far your best bet. Do you have access to that?

    2. is easy to answer: yes, use the identity transform. This will not preserve all lexical details of the input XML, such as order of attributes, or whether <foo/> is written as <foo></foo>. However it will preserve all details that are supposed to matter to XML processors.

      But this won’t help you if you can’t run 2 stylesheets in a pipeline, right?

    3. Hmm… not robustly. If your output is going to be displayed by a browser, or handled by something else that understands the xml-stylesheet processing instruction, you could output one of those and hope (against the spec’s recommendation!) that serialization and parsing would occur between this stylesheet and the one you associated on output. But this would be very fragile. I say “against the spec’s recommendation” because the spec says

      When this or any other mechanism
      yields a sequence of more than one
      XSLT stylesheet to be applied
      simultaneously to a XML document, then
      the effect should be the same as
      applying a single stylesheet that
      imports each member of the sequence in
      order

      which would imply that no serialization and parsing happens in between. Not recommended.
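
    For reference, the identity transform mentioned in point 2 is usually written along these lines. This is a minimal sketch rather than anything specific to your application; extra templates added alongside it override the copy for whichever nodes they match.

    ```xml
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Identity template: copy each attribute or node as-is,
           then recurse into its attributes and children. -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Because the template matches with the lowest default priority, any more specific template (for one element name, say) takes precedence without further configuration.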

    Update: a new comment says that you don’t know in advance which elements will contain CDATA sections. I jumped to the conclusion that this meant you didn’t know which elements would contain unparsed data. In that case, all bets are off. XML processors are not supposed to know or care which parts of an input document were written as CDATA sections: CDATA is just a different way of escaping markup, an alternative to &lt; etc. Once the data is parsed (which is not properly under the XSLT processor’s jurisdiction), you can’t tell how it was originally expressed in markup. A left pointy bracket remains a left pointy bracket whether it’s expressed as <![CDATA[ < ]]> or &lt;. It is just as in C, where it doesn’t matter whether you write a character as 'A' or 65 or 0x41; once the program is compiled, your code can’t tell the difference.

    Therefore, if you don’t have another way of determining which data in your input document needs to be parsed, then none of the above methods will help you: you can’t know where to apply saxon:parse(), nor manual parsing, nor disable-output-escaping with a following XSLT transformation.
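
    A small experiment illustrates why the CDATA markup is invisible to a stylesheet. The input document described in the comment is hypothetical: the element names doc and a are made up for the illustration.

    ```xml
    <!-- Hypothetical input:
         <doc>
           <a><![CDATA[<b/>]]></a>
           <a>&lt;b/&gt;</a>
         </doc>
    -->
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="text"/>

      <!-- Compare the string values of the two <a> elements.
           Both are the four characters <b/>, however they were escaped. -->
      <xsl:template match="/doc">
        <xsl:value-of select="string(a[1]) = string(a[2])"/>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Run against that input, the comparison is true: by the time the stylesheet sees the tree, both elements hold an identical text node.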

    Workarounds:

    • You could guess, e.g. with test="contains(., '&lt;')", which nodes contain unparsed data. (Note this tests for the left pointy bracket, regardless of whether it’s expressed as a character entity, or part of a CDATA section, or any other way.) You’d sometimes get false positives, e.g. if a text node contained the string “year < 2001”. Or you could attempt to parse every text node (very inefficient), and for those that parse successfully as well-formed XML documents, output the tree instead of the text.

    • Or you could preprocess the XML with a non-XML tool (like LexEv), which therefore can “see” the CDATA markup. But you’ve said that you can’t control anything outside the single XSLT.

    • Or, ideally, you could send the message back up the chain that the XML you’re being given is unworkable: they need to flag somehow, other than by using CDATA markup, which sections contain unparsed data. Usually this would be done either by specifying certain element names, or by using attribute flags. Obviously this would depend on who’s supplying the XML.
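
    The guessing workaround in the first bullet can be tried out with a purely diagnostic stylesheet like the sketch below. It changes nothing; it only lists the parent element name of every text node that contains a left pointy bracket, so you can see how many candidates (and false positives) your data produces.

    ```xml
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="text"/>

      <xsl:template match="/">
        <!-- Every text node whose value contains '<', however it was escaped. -->
        <xsl:for-each select="//text()[contains(., '&lt;')]">
          <!-- Print the name of the element that holds the candidate. -->
          <xsl:value-of select="name(..)"/>
          <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
      </xsl:template>

    </xsl:stylesheet>
    ```

    A text node such as “year < 2001” would be listed too, which is exactly the false-positive risk described above.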

    Another update
    OK, now I understand: so you know which element contains unparsed data (and you know it’s marked up with CDATA), but you don’t know which other data might be marked up with CDATA.

    the idea was to transform [i.e. parse -Lars] the known
    CDATA node (“fred”) into XML nodes
    while leaving the whole of the rest
    of the document as original input,
    so that it could then be piped through
    the “general” transformation

    For this purpose, “leaving the whole of the rest of the document as original input” does not need to mean preserving any CDATA markup. (The general transformation downstream will not know or care what data is CDATA-escaped.) All that is required is that the one unparsed node gets parsed and the rest does not. The identity transform will do the latter just fine; you can ignore what that page says about CDATA sections on the output, since the downstream XSLT will not know or care. (Unless you have additional, non-XML requirements for the output that you haven’t told us about.)

    So if you could do a two-stylesheet transform, with serialization and parsing in between (i.e. not in a traditional SAX pipeline, for example), then the identity transform would work: you’d just need an additional template for the known unparsed node, with disable-output-escaping, as in Tomalak’s answer here.
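
    A first-phase stylesheet of that kind might look like the sketch below. It assumes the known unparsed element is called fred, as in the comment quoted above, and that the result really is serialized and re-parsed before the second stylesheet runs; disable-output-escaping only takes effect at serialization, and a processor is allowed to ignore it.

    ```xml
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Identity template: pass everything else through unchanged. -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <!-- Text inside the known unparsed element: write it out raw,
           so that "<" is emitted as markup rather than as &lt;. -->
      <xsl:template match="fred/text()">
        <xsl:value-of select="." disable-output-escaping="yes"/>
      </xsl:template>

    </xsl:stylesheet>
    ```

    The embedded text must itself be well-formed element content for the re-parse to succeed; an XML declaration inside it, for instance, would break the outer document.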

    But if you can’t do a two-step transform… what XSLT processor are you using? There may be other avenues specific to it.
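
    If the processor turns out to be Saxon, the single-stylesheet route from point 1 might look roughly like this. It is only a sketch: it assumes a Saxon release and edition that provides the saxon:parse() extension function, and it again uses fred as the name of the known unparsed element.

    ```xml
    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:saxon="http://saxon.sf.net/"
        exclude-result-prefixes="saxon">

      <!-- Identity template: pass everything else through unchanged. -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <!-- Known unparsed element: parse its string value into a document
           node, then apply templates to the parsed children. -->
      <xsl:template match="fred">
        <xsl:copy>
          <xsl:apply-templates select="@*"/>
          <xsl:apply-templates select="saxon:parse(string(.))/node()"/>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Because the parsed nodes go back through apply-templates, any templates of the “general” transformation placed in the same stylesheet would see them as ordinary elements. Processors that implement XPath 3.0 or later offer a standard parse-xml() function with a similar role.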
