I am implementing a function (in Python) that checks for conformance of the string to xsd:anyURI.
According to Schema Central, it only makes sense to check for two things: a repeated # character (consecutive or not), and a % followed by something other than two hex digits (0-9, A-F, a-f).
So far I have something like this, and it seems to be working:
if re.search(r'(%[^0-9A-Fa-f]+)|(#.*#+)', uri):
The second expression for multiple ‘#’ signs may be faulty.
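If you are aiming for an exclusion regex that follows the Schema Central parser requirement, you are almost there. The first half, rejecting percent signs that are not followed by two hexadecimal digits, is best solved with a negative look-ahead assertion: your current character class misses a % at the very end of the string and a % followed by only one hex digit. The second half is fine, though you can drop the trailing + quantifier without affecting your results, since #.*# already matches as soon as a second # appears.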
Compile your regex with case insensitivity (the re.I flag, also spelled re.IGNORECASE) and you are good to go.
Recommended reading: the Python Standard Library’s chapter on Regular Expression Operation Syntax.
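Putting the advice together, a minimal sketch might look like the following. The pattern is reconstructed from the description above (negative look-ahead for the percent sign, a plain #.*# for the repeated hash) and is an assumption rather than a quoted original; the names INVALID_ANYURI and looks_like_any_uri are hypothetical.

```python
import re

# Exclusion pattern: a match means the string is NOT acceptable.
#   %(?![0-9A-F]{2})  a percent sign not followed by two hex digits
#   #.*#              a second '#' somewhere after the first one
# re.IGNORECASE lets A-F also cover a-f.
INVALID_ANYURI = re.compile(r"%(?![0-9A-F]{2})|#.*#", re.IGNORECASE)


def looks_like_any_uri(uri):
    """Return True if neither of the two defects is found in uri."""
    return INVALID_ANYURI.search(uri) is None
```

With this sketch, looks_like_any_uri("http://example.com/a%20b#frag") passes, while strings such as "a%2Gb", "100%" or "a#b#c" are rejected. Note that this only covers the two checks discussed here; it is not a full URI validator.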