I have a document with the following format:
<scheme attr1="lorem" attr2="ipsum" global-test="text goes here" global-attr2="second text goes here">
</scheme>
I want to use a regular expression to extract all the attributes that match global-(.*).
The match must also be restricted to the “scheme” element, so a simple regular expression like (global-([^=]*)="([^"]*)")+ is not an option. I tried the following regular expression:
<scheme.*([\s]+global-([^=]*)="([^"]*)")+
But this only captures “global-attr2”; the other global attributes are consumed by the .* part. Making the .* quantifier lazy doesn’t seem to help either.
And I know that getting data from an XML document with regular expressions isn’t a good practice, but this script is for a preprocessor. It modifies the XML before parsing it.
preg_match_all will find every match and store all of them. So first match against “<scheme”, and if it matches, run preg_match_all with a pattern for just the global- attributes, and then extract the results from matches[0], matches[1], etc.
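A minimal sketch of that two-step approach, using the sample element from the question. The exact patterns here are assumptions rather than the ones originally suggested, and the first pattern assumes that no attribute value contains a literal ">".

```php
<?php
// Sample input taken from the question.
$xml = '<scheme attr1="lorem" attr2="ipsum" global-test="text goes here" global-attr2="second text goes here">'
     . "\n" . '</scheme>';

$globals = [];

// Step 1: isolate the opening <scheme ...> tag so other elements are ignored.
// Assumption: attribute values never contain ">".
if (preg_match('/<scheme\s[^>]*>/', $xml, $tag)) {
    // Step 2: collect every global-* attribute inside that tag only.
    preg_match_all('/\sglobal-([^=\s]+)="([^"]*)"/', $tag[0], $matches);

    // $matches[0] holds the full matches, $matches[1] the attribute
    // names (without the "global-" prefix), $matches[2] the values.
    $globals = array_combine($matches[1], $matches[2]);
}

print_r($globals);
```

For the sample input, $globals ends up with the keys "test" and "attr2". Splitting the work into two calls avoids the repeated capture group from the question, which only retains its last iteration.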