The Archive Base


Editorial Team
Asked: June 11, 2026

How can I correctly parse an XML stylesheet processing instruction?

How can I correctly parse an XML stylesheet processing instruction? As I understand, the value of an XML processing instruction such as:

<?xml-stylesheet type="application/xsl" src="style.xsl" version="1.0"?>

is:

type="application/xsl" src="style.xsl" version="1.0"

How can I parse that into a list of key-value pairs? I’ve searched around for some examples of how to do this but haven’t been able to find any.

The key word here is correctly… I don’t want to just write a simple regex that may fail in certain situations; I want to make sure I parse this fully in accordance with how you’d properly parse an XML stylesheet instruction.


1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 5:00 am

    The grammar of the XML stylesheet PI is given in the spec (the W3C Recommendation "Associating Style Sheets with XML documents"), so if you want to do it right, it’s simply a matter of writing a parser for that grammar. Since the language is in fact regular, it can be parsed correctly with a regular expression. The biggest complication is likely to be that the XML spec does not require character references or the predefined entity references to be recognized within a processing instruction, so you are likely to be responsible for handling those yourself.
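
To illustrate the reference handling mentioned above, here is a minimal sketch in Python (not part of the XQuery code in this answer). The names `decode_refs`, `PREDEFINED`, and `_REF` are hypothetical, and the sketch assumes the value has already been checked against the grammar, so every `&` begins a character reference or one of the five predefined entity references.

```python
import re

# The five predefined entity references from the XML spec.
PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}

# A decimal character reference, a hexadecimal one, or a predefined entity reference.
_REF = re.compile(r"&(?:#([0-9]+)|#x([0-9a-fA-F]+)|(amp|lt|gt|quot|apos));")


def decode_refs(value):
    """Expand character and predefined entity references in a pseudo-attribute value."""

    def expand(m):
        dec, hexa, name = m.groups()
        if dec is not None:
            return chr(int(dec))        # e.g. &#65; -> "A"
        if hexa is not None:
            return chr(int(hexa, 16))   # e.g. &#x42; -> "B"
        return PREDEFINED[name]         # e.g. &amp; -> "&"

    # A single left-to-right pass, so the output of one expansion is never rescanned.
    return _REF.sub(expand, value)
```

A general HTML unescaping routine is deliberately avoided here, because it would also expand HTML-only names such as `&nbsp;`, which the stylesheet PI grammar does not allow. Note that `chr` raises `ValueError` for code points above U+10FFFF; whether to reject or report those is left open in this sketch.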

    As to exactly how you should do it, that depends on what environment you’re working in. As an example, here is an XQuery function that does the job and returns a list of elements created from the pseudo-attributes in the processing instruction; if the PI doesn’t match the grammar given in the spec, it returns a single element named error.

    declare function bmt:parse-sspi($s as xs:string) 
      as element()* {
    
      if (bmt:check-sspi($s)) then
         let $s1 := substring-after($s,"<?xml-stylesheet"),
             $s2 := substring-before($s1,"?>")
         return bmt:parse-pseudoatts($s2) 
      else <error/>
    };
    

    This function hands off the real work of parsing the pseudo-attributes to a separate recursive function which parses off one attribute-value pair on each call:

    declare function bmt:parse-pseudoatts($s as xs:string) 
      as element()* {
    
      (: We know that $s is a syntactically legal sequence
         of pseudo-attribute value specifications. So we
         can get by with simpler patterns than we would
         otherwise need.
         :)
    
      let $s1 := replace($s,"^\s+","")
      return if ($s1 = "") then () else
             let $s2 := substring-before($s, '='),
                 $Name := normalize-space($s2),
                 $s3 := substring-after($s, '='),
                 $s4 := replace($s3,"^\s+",""),
                 $Val := if (starts-with($s4,'"')) then
                            substring-before(
                              substring($s4,2),
                              '"')
                         else if (starts-with($s4,"'")) then
                            substring-before(
                              substring($s4,2),
                              "'")
                         else <ERROR/>,
                 $sRest := if (starts-with($s4,'"')) then
                            substring-after(
                              substring($s4,2),
                              '"')
                         else if (starts-with($s4,"'")) then
                            substring-after(
                              substring($s4,2),
                              "'")
                         else ""
    
      return (element {$Name} { $Val }, 
              bmt:parse-pseudoatts($sRest))
    };
    

    As the comments indicate (and as you can see), both of these benefit from knowing in advance that the PI is in fact legal. So we can parse off the pseudo-attribute name by stripping whitespace from whatever precedes the first “=” in the string, and so on.

    The guarantee of correctness is given by a separate check-sspi function, which systematically constructs a regular expression in a way that makes it easy to compare the function with the grammar in the spec, to check that the function is correct.

    declare function bmt:check-sspi($s as xs:string) 
      as xs:boolean {
    
      let $pio := "<\?",
          $kw := "xml-stylesheet",
          $pic := "\?>",
          $S := "\s+",
          $optS := "\s*",
          $Name := "\i\c*",
          $CharRef := "&amp;#[0-9]+;|&amp;#x[0-9a-fA-F]+;",
          $PredefinedEntityRef := concat("&amp;amp;",
                                         "|&amp;lt;",
                                         "|&amp;gt;",
                                         "|&amp;quot;",
                                         "|&amp;apos;"),
          $dq := '"',
          $sq := "'",
          (: A quoted value: characters other than the delimiter,
             "<" and "&", or a character / predefined entity reference. :)
          $dqstring := concat($dq,
                              "(",
                              "[^", $dq, "&lt;&amp;]",
                              "|",
                              $CharRef,
                              "|",
                              $PredefinedEntityRef,
                              ")*",
                              $dq),
          $sqstring := concat($sq,
                              "(",
                              "[^",$sq,"&lt;&amp;]",
                              "|",
                              $CharRef,
                              "|",
                              $PredefinedEntityRef,
                              ")*",
                              $sq),
          $psAttVal := concat("(",$dqstring,"|",$sqstring,")"),
          $pseudoAtt := concat("(", 
                               $Name, 
                               $optS, "=", $optS, 
                               $psAttVal,
                               ")"),
          $sspi := concat($pio,
                          $kw,
                          "(", $S, $pseudoAtt, ")*",
                          $optS,
                          $pic),
          $sspi2 := concat("^", $sspi, "$")
          return matches($s, $sspi2)
    };
    

    For the test string

    <?xml-stylesheet  foo="bar"
          href="http://www.w3.org/2008/09/xsd.xsl"
          type='text/xsl'
    ?>
    

    the top-level parse-sspi function returns

    <foo>bar</foo>
    <href>http://www.w3.org/2008/09/xsd.xsl</href>
    <type>text/xsl</type>
    

    These functions could be somewhat more compact if we just did the parsing with a single Perl-style regular expression. Some people might find such a compact form more natural and easier to follow; others will prefer a less succinct formulation like the one given here.
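
For comparison, a compact regex-based version might look roughly like the following Python sketch. It is an illustration under stated assumptions, not a translation of the XQuery above: `parse_stylesheet_pi` and the pattern names are hypothetical, `_NAME` is only an ASCII-oriented approximation of the XML `Name` production, and values are returned with their references still undecoded.

```python
import re

# A character reference or one of the five predefined entity references.
_REF = r"&(?:#[0-9]+|#x[0-9a-fA-F]+|amp|lt|gt|quot|apos);"
# A quoted value: no "<", no bare "&", and not the quote character that delimits it.
_VALUE = rf'"(?:[^"<&]|{_REF})*"|\'(?:[^\'<&]|{_REF})*\''
# ASSUMPTION: simplified stand-in for the XML Name production.
_NAME = r"[A-Za-z_:][\w.:-]*"
# One pseudo-attribute, capturing the name and the quoted value.
_ATT = rf"({_NAME})\s*=\s*({_VALUE})"
# The whole PI; group 1 is the run of pseudo-attributes.
_PI = re.compile(rf"<\?xml-stylesheet((?:\s+{_NAME}\s*=\s*(?:{_VALUE}))*)\s*\?>")


def parse_stylesheet_pi(pi):
    """Return the pseudo-attributes as (name, value) pairs, or None if pi is not legal."""
    m = _PI.fullmatch(pi)
    # The grammar also forbids "?>" inside a value, since that would end the PI early.
    if m is None or "?>" in m.group(1):
        return None
    # Validation has already succeeded, so findall sees only well-formed pairs;
    # [1:-1] strips the surrounding quote characters.
    return [(name, quoted[1:-1]) for name, quoted in re.findall(_ATT, m.group(1))]
```

Calling `parse_stylesheet_pi` on the test string above should give the same three name/value pairs as the XQuery version, in document order, and a string that does not match the grammar gives `None` rather than a partial result.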
