I’m trying to split the following (Delphi RTTI output) by the namespace delimiter .:
System.Generics.Collections.TEnumerator<Utils.TPair<System.string,System.string>>
The correct split should be [System, Generics, Collections, TEnumerator<Utils.TPair<System.string,System.string>>].
First I tried a negative lookahead \.(?!\<*[a-zA-Z0-9_.,]*\>), but that matched both the period in Utils.TPair and the leftmost System.string. I am a little surprised, I might add, that it (correctly) matched the period in Collections.TEnumerator. I guess this is a testament to my command of the regex language.
So I tried making it “greedier” by saying this: \.(?!\<*[a-zA-Z0-9_.,<>]*\>), but then no match was found. (I know this isn’t what regexers usually mean when they say “greedy”, but I couldn’t come up with a more suitable description.)
So I decided to start from scratch. As far as I understand, I should be able to use negative lookarounds to solve my case. In particular, any . that comes after a < can effectively be ignored. So I thought (?<!\<[a-zA-Z0-9_]*)\. should solve my problem. It doesn’t, probably because many negative lookbehind implementations don’t support variable-length patterns. (To be specific, PCRE, which is basically what Delphi uses, apparently supports alternatives of different lengths in a lookbehind, but each alternative must itself be fixed-length.)
And thus I turn to you, the Community.
Can anyone please shed some light on this problem, which actually should be quite simple? Would be great!
Try this regex:
It basically means: match every . that is not followed by a . that is before a > and thus inside a <>. It is actually very limited to your example.
See on rubular
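Since a lookahead like this is tied to the shape of the example, a regex-free fallback may be worth considering. The following is only a minimal sketch in Python (not Delphi, and not the regex from this answer): it tracks the angle-bracket nesting depth and splits on . only at depth zero. The function name split_namespace is made up for illustration, and the input is assumed to have balanced brackets.

```python
from typing import List


def split_namespace(qualified_name: str) -> List[str]:
    """Split on '.' only where the angle-bracket nesting depth is zero.

    Hypothetical helper; assumes '<' and '>' are balanced in the input.
    """
    parts = []
    current = []
    depth = 0
    for ch in qualified_name:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == "." and depth == 0:
            # Top-level delimiter: close the current segment.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


name = "System.Generics.Collections.TEnumerator<Utils.TPair<System.string,System.string>>"
print(split_namespace(name))
```

The same loop should translate to Delphi fairly directly, and unlike the lookahead it also copes with a . that follows a closing >.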
A negative lookbehind can solve this task better:
It means: match any . that does not have a < before it. I tried this in Java and it works. Java does not allow unbounded lookbehinds, that’s why I used a limit of 1000, {0,1000} instead of *. I don’t know if PCRE supports it.
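Where bounded lookbehinds are not available either (Python's re module, for instance, requires fixed-width lookbehinds), the same rule, "split only on a . that has no < before it", can be approximated without a regex. This is only a sketch in Python with a made-up function name: it cuts the string at the first <, splits the part before it on ., and glues the rest back onto the last segment.

```python
from typing import List


def split_before_first_bracket(qualified_name: str) -> List[str]:
    """Split on every '.' that has no '<' anywhere before it.

    Hypothetical helper mirroring the lookbehind idea, not the Java regex itself.
    """
    # head: everything before the first '<'; bracket is '' if there is no '<'.
    head, bracket, tail = qualified_name.partition("<")
    parts = head.split(".")
    # Re-attach the generic argument list to the last segment, unsplit.
    parts[-1] += bracket + tail
    return parts


name = "System.Generics.Collections.TEnumerator<Utils.TPair<System.string,System.string>>"
print(split_before_first_bracket(name))
```

Like the lookbehind, this treats everything after the first < as off limits, so a . that follows a closing > would not be split on.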