I have the following patterns:
private static Regex rgxDefinitionDoMatch = new Regex(@"d:(?<value>(?:(?!c:|d:|p:).)+)", RegexOptions.Compiled);
private static Regex rgxDefinitionDontMatch = new Regex(@"\!d:(?<value>(?:(?!c:|d:|p:).)+)", RegexOptions.Compiled);
private static Regex rgxDefinitionExactDoMatch = new Regex(@"d:(?<value>\""(?:(?!c:|d:|p:).)+)\""", RegexOptions.Compiled);
private static Regex rgxDefinitionExactDontMatch = new Regex(@"\!d:(?<value>\""(?:(?!c:|d:|p:).)+)\""", RegexOptions.Compiled);
Here is an example string to match:
c:matchThis !c:dontMatchThis p:matchThis !p:dontMatchThis d:def !d:defDont d:"def" !d:"defDont"
Now here are some issues:
1. When I use rgxDefinitionDontMatch, I get both !d:defDont and d:"defDont".
2. When I use rgxDefinitionDoMatch it is even worse… I get !d:defDont, d:"defDont", !d:def and d:"def".
For number 2, I have tried different combinations to ignore the exclamation mark at the front of rgxDefinitionDoMatch (^(?!\!), for example), but then it just doesn’t match anything. I’m not sure what to do.
I will also need a way of ignoring quotes for both problems 1 and 2.
Can anyone help? I’ve been trying for some time now.
Is this what you’re looking for?
As I was trying to figure out what you were asking, I finally decided the simplest course was to post my code and get your feedback. I’ll try to refine it as needed, and (of course) explain it. 😀
EDIT: Here are the separate regexes you asked for in the comments:
Combining them the way I did, it doesn’t matter whether the “value” part is quoted or not: it’s still captured (without the quotes, if they’re present). I thought that’s what you meant by “ignoring quotes”. What’s interesting about the combined form is how I used the same group name twice in the same regex, something few regex flavors support.
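For illustration, a combined pattern along these lines could look like the sketch below. It is a reconstruction under assumptions, not necessarily the exact code posted with this answer: the class name DefinitionPatterns and the helper Values are hypothetical, and the patterns only handle the d: prefix from the question. The quoted alternative comes first so that the quotes are consumed outside the value group, and both alternatives reuse the group name, which .NET permits.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DefinitionPatterns
{
    // Assumed combined form: quoted alternative first, then the unquoted one.
    // Both alternatives capture into the same named group, "value".
    public static readonly Regex DoMatch = new Regex(
        @"(?<!\S)d:(?:""(?<value>[^""]+)""|(?<value>\S+))", RegexOptions.Compiled);

    public static readonly Regex DontMatch = new Regex(
        @"(?<!\S)!d:(?:""(?<value>[^""]+)""|(?<value>\S+))", RegexOptions.Compiled);

    // Hypothetical helper: collects the "value" group of every match.
    public static string[] Values(Regex rgx, string input) =>
        rgx.Matches(input).Cast<Match>()
           .Select(m => m.Groups["value"].Value)
           .ToArray();

    public static void Main()
    {
        // The example string from the question, written as a verbatim literal.
        const string input =
            @"c:matchThis !c:dontMatchThis p:matchThis !p:dontMatchThis d:def !d:defDont d:""def"" !d:""defDont""";

        Console.WriteLine(string.Join(", ", Values(DoMatch, input)));   // def, def
        Console.WriteLine(string.Join(", ", Values(DontMatch, input))); // defDont, defDont
    }
}
```

Because only one alternative can take part in any given match, Groups["value"].Value always holds the text from whichever branch matched.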
(?<!\S), a negative lookbehind for a non-whitespace character, solves the question you posed in your comment: it ensures that every match starts either at the beginning of the string or after a whitespace character. Similarly, the \S+ ensures that the match ends at the end of the string or before the next whitespace character.
"[^"]+", obviously, matches anything enclosed in quotes, except other quotes. It permits the value to contain whitespace, which I presumed was the reason for the separate regexes. But I mainly wanted to point out that you didn’t need to use backslashes to escape the quotes. In a C# verbatim string, it’s the extra quote that does the escaping: @"""[^""]+""".
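To make the effect of each piece concrete, here is a minimal sketch that runs three variants over a short input. The patterns, the class name LookbehindDemo and the helper Values are assumptions made for this illustration, not the regexes originally posted. It compares a pattern without the lookbehind, the same pattern with it, and a quoted-only pattern written with doubled quotes in a verbatim string.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // No lookbehind: also matches the "d:" that follows an exclamation mark.
    public static readonly Regex Unanchored = new Regex(@"d:(?<value>\S+)");

    // With (?<!\S): a match may only start at the string start or after whitespace.
    public static readonly Regex Anchored = new Regex(@"(?<!\S)d:(?<value>\S+)");

    // Quoted-only form; "" inside a verbatim string is one literal quote.
    public static readonly Regex Quoted = new Regex(@"(?<!\S)d:""(?<value>[^""]+)""");

    // Hypothetical helper: collects the "value" group of every match.
    public static string[] Values(Regex rgx, string input) =>
        rgx.Matches(input).Cast<Match>()
           .Select(m => m.Groups["value"].Value)
           .ToArray();

    public static void Main()
    {
        const string input = @"d:def !d:defDont d:""two words""";

        // Three matches: the negated term leaks in and the quoted value is cut at the space.
        Console.WriteLine(Values(Unanchored, input).Length); // 3

        // Two matches: !d:defDont is rejected by the lookbehind.
        Console.WriteLine(Values(Anchored, input).Length);   // 2

        // One match, with the whitespace kept and the quotes dropped.
        Console.WriteLine(Values(Quoted, input)[0]);         // two words
    }
}
```

Note that the anchored unquoted pattern still picks up the start of a quoted value, quote included, which is one reason to try the quoted form first when the two are combined.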