I’m totally new to regular expressions, but I plan to master them soon. For now I need your help with the following:
I want to find all commas ( , ) that are inside quoted ( " ) string literals.
For example:
"bla bla , bla bla"
and also:
","
but not if it’s an argument delimiter, like:
Replace("abc","b","f")
Maybe it’s very simple if you know regexp, but not for me (yet) ;).
You start with something like this
That’s the simplest way to match something which is quoted. You then change that to match your comma pattern.
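As a minimal sketch of that starting point in Python, assuming the common `"[^"]*"` form (an opening quote, any run of non-quote characters, a closing quote); the name `quoted_strings` is made up for illustration:

```python
import re

# Assumed pattern: opening quote, any non-quote characters, closing quote.
QUOTED = re.compile(r'"[^"]*"')

def quoted_strings(text):
    """Return every double-quoted substring, quotes included."""
    return QUOTED.findall(text)

print(quoted_strings('Replace("abc","b","f")'))  # ['"abc"', '"b"', '"f"']
```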
But you want to capture the comma, so you make that a group.
And then, to be able to find many commas in one string, you can use repetition with a non-capturing group.
That should work for you. Regular expressions are somewhat limited and work best when used for what they are good at. The above pattern looks for a quote, then for a run of characters that are neither quotes nor commas followed by a comma, and then goes on looking. It’s based on repetition of the first group, so it can find many commas in one string (you’ll access them through the Captures property of that Group when you do the match).
As long as the string doesn’t contain quotes itself, this will work nicely, but it’s hard to handle escape sequences within quoted strings with a regex. That’s something they aren’t good at. So, as long as that’s fine, go with it.
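A rough Python sketch of the repeated-group idea, assuming the pattern has the shape `"(?:[^",]*(,))+[^"]*"`. Python’s `re` module has no Captures collection and keeps only the last repetition of a group, so this sketch recovers the comma positions from the matched span instead; `commas_in_first_quoted` is a hypothetical helper name.

```python
import re

# Assumed shape: a quote, one or more repetitions of "non-quote, non-comma
# characters followed by a captured comma", then the rest up to the closing quote.
COMMA_IN_QUOTES = re.compile(r'"(?:[^",]*(,))+[^"]*"')

def commas_in_first_quoted(text):
    """Return offsets of the commas inside the first quoted string that has any."""
    m = COMMA_IN_QUOTES.search(text)
    if m is None:
        return []
    # Only the last capture of group 1 survives in Python's re, so scan the
    # matched text for commas rather than reading them from the group.
    return [m.start() + i for i, ch in enumerate(m.group()) if ch == ',']

print(commas_in_first_quoted('"a,b,c"'))  # [2, 4]
```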
Now to the issue with a string of this type:
"a","b". The regex scans the string from left to right and consumes characters in that order; once a match succeeds, those characters are not matched again in any other way. The problem here is that a quoted string without commas is not a successful match, so the scan resumes inside it and can pair a closing quote with the next opening quote (if we make it a match but ignore it, we can work around this). We always attempt our original pattern first, but fall back to a plain quoted string which we just ignore; that way the regex skips ahead and does not treat the text between two strings as a valid match. It’s all about making sure that the state machine, that is, the regex, can keep track of the opening and closing of quoted values.
That’s your final solution, but you have to check that Group[1] is successful, because the pattern now also succeeds when it finds a plain quoted string, in which case the capturing group Group[1] does not.
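A hedged sketch of this fallback in Python, assuming the final pattern is the comma pattern followed by an alternative for a plain quoted string. In Python the check on Group[1] becomes `m.group(1) is None`; the function name `quoted_comma_offsets` is invented for this example, and escaped quotes inside strings are not handled.

```python
import re

# First alternative: quoted string containing at least one comma (group 1 set).
# Second alternative: any plain quoted string, matched only so it gets consumed.
PATTERN = re.compile(r'"(?:[^",]*(,))+[^"]*"|"[^"]*"')

def quoted_comma_offsets(text):
    """Return offsets of all commas that sit inside double-quoted strings."""
    offsets = []
    for m in PATTERN.finditer(text):
        if m.group(1) is None:
            continue  # plain quoted string: skip it, it only keeps the scan in sync
        offsets.extend(m.start() + i for i, ch in enumerate(m.group()) if ch == ',')
    return offsets

print(quoted_comma_offsets('Replace("abc","b","f")'))    # []
print(quoted_comma_offsets('Replace("a,b",",","f")'))    # [10, 15]
```

Without the second alternative, the scan over `"a","b"` would resume after the failed first quote and match `","` across the two strings, which is exactly the false positive the fallback avoids.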