I would like to use a regex to search a string for the smallest sequence that fits a starting and ending delimiter (taking an escape character into account). For example, given the following string, I would want to locate the innermost matches [ two ] and [ four \[ five \] ] while ignoring the match that contains them, [ one ... three ... six ]:
zero [ one [ two ] three [ four \[ five \] ] six ] seven
So far I have the following regex, which uses a negative lookahead to check and isn’t quite catching the last ] in the second match:
(\[)(?:(?!(?:[^\\])\1|\]).)*]
My goal is to have a simple parser I can use to process simple nested command blocks.
The following works:
See it working: http://www.rubular.com/r/cAajtm2wxw
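For readers who cannot open the link, here is a minimal Ruby sketch of one pattern that finds the two innermost blocks in the sample string. It is an illustration only and not necessarily the pattern behind the link above; the constant name `INNERMOST` is made up for this example. It assumes that a backslash directly before a bracket always escapes that bracket.

```ruby
# Hypothetical pattern (an assumption, not the answer's original regex):
#   (?<!\\)\[      an opening bracket not preceded by a backslash
#   (?: \\[\[\]]   an escaped bracket, kept as literal text
#     | [^\[\]] )* or any character that is not a bracket
#   \]             the first unescaped closing bracket
INNERMOST = /(?<!\\)\[(?:\\[\[\]]|[^\[\]])*\]/

# Single quotes keep the backslashes in the sample exactly as typed.
sample = 'zero [ one [ two ] three [ four \[ five \] ] six ] seven'

matches = sample.scan(INNERMOST)
puts matches
# prints:
# [ two ]
# [ four \[ five \] ]
```

Because the body may not contain an unescaped [, an attempt that starts at the outer [ fails as soon as it reaches a nested [, so only the innermost blocks can match.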
Explanation:
Note that this isn’t quite safe because in a string like
[ one \\[ two ] three ]
the backslash is escaped, so the backslash before the [ should not escape it. To fix this you could use the following:
This changes the single escaped backslash \\ in the original regex to the following regex, which checks for an odd number of backslashes:
http://www.rubular.com/r/BhQzLQpyB9
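The same even/odd backslash problem can also be handled without counting backslashes in a lookbehind. The Ruby sketch below uses a different technique from the one described above: it consumes every backslash together with the character after it, so \\ is eaten as a pair and can never escape the bracket that follows. The names `ESCAPE_AWARE` and `innermost_blocks` are hypothetical and exist only for this example.

```ruby
# Hypothetical escape-aware pattern (an assumption, not the answer's regex).
# Alternative 1 consumes any escaped character outside a block.
# Alternative 2 captures an innermost block whose body is made of escaped
# characters or characters that are neither brackets nor backslashes.
ESCAPE_AWARE = /\\.|(\[(?:\\.|[^\[\]\\])*\])/m

# Returns only the captured blocks; matches of alternative 1 leave the
# capture group nil and are dropped by compact.
def innermost_blocks(text)
  text.scan(ESCAPE_AWARE).flatten.compact
end

# In single quotes, \\\\ is two literal backslashes: an escaped backslash
# followed by a real, unescaped opening bracket.
puts innermost_blocks('[ one \\\\[ two ] three ]')
# prints:
# [ two ]
```

With a single backslash, as in `[ one \[ two ] three ]`, the bracket stays escaped and the sketch returns `[ one \[ two ]` instead.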