I have a variable that contains a long string representing an XML document. Within that string, I need to find every self-closing tag and expand it into a matching pair of opening and closing tags. I’m not sure how to tackle this and would appreciate your advice. So far, all I know is how to match a self-closing tag with a regex: [^<]+?/>
Here’s a short example of what I would like to accomplish:
ORIGINAL STRING:
<outer-tag>
<inner-tag-1>
<SELF-CLOSING-TAG-1 foo="bar"/>
<SELF-CLOSING-TAG-2/>
</inner-tag-1>
<inner-tag-2>
<SELF-CLOSING-TAG-3 attr="value"/>
</inner-tag-2>
</outer-tag>
MODIFIED STRING:
<outer-tag>
<inner-tag-1>
<SELF-CLOSING-TAG-1 foo="bar"></SELF-CLOSING-TAG-1>
<SELF-CLOSING-TAG-2></SELF-CLOSING-TAG-2>
</inner-tag-1>
<inner-tag-2>
<SELF-CLOSING-TAG-3 attr="value"></SELF-CLOSING-TAG-3>
</inner-tag-2>
</outer-tag>
I have used the W3C XML specification to create a regexp which correctly parses tags in well-formed XML.
First, match the characters which define the start-tag name (per the spec). Then, match the remaining characters of the tag, excluding any trailing whitespace and the closing />. Finally, globally replace each matched substring with "<" + starttag + remaining + "></" + starttag + ">". See below:
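As a rough illustration of these steps (not the exact expression from this answer), here is a minimal sketch in Python. The language choice is an assumption, since the question does not name one. The names NAME, ATTR, SELF_CLOSING and expand_self_closing are hypothetical, and the name pattern is a simplified ASCII-only approximation of the spec's Name production rather than the full character ranges.

```python
import re

# Simplified, ASCII-only stand-in for the XML "Name" production (assumption:
# the real spec also permits many non-ASCII characters).
NAME = r"[A-Za-z_:][A-Za-z0-9_:.\-]*"

# One attribute: whitespace, a name, "=", then a single- or double-quoted value.
# Matching quoted values as a unit keeps a "/>" inside a value from ending the tag.
ATTR = rf"""\s+{NAME}\s*=\s*(?:"[^"<]*"|'[^'<]*')"""

# Group 1 is the tag name, group 2 is everything after it (the attributes),
# excluding trailing whitespace and the closing "/>".
SELF_CLOSING = re.compile(rf"<({NAME})((?:{ATTR})*)\s*/>")


def expand_self_closing(xml: str) -> str:
    """Replace every <tag .../> with <tag ...></tag>."""
    return SELF_CLOSING.sub(r"<\1\2></\1>", xml)


if __name__ == "__main__":
    print(expand_self_closing('<SELF-CLOSING-TAG-1 foo="bar"/>'))
```

This sketch assumes well-formed input and does not skip comments, CDATA sections, or processing instructions, so text inside them that looks like a self-closing tag would be rewritten as well. When that matters, a real XML parser is the safer tool.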