Greetings.
I’ve been tasked with debugging part of an application that involves a Regex, but I have never dealt with Regex before. Two questions:
1) I know that the regexes are supposed to be testing whether or not two strings are equivalent, but what specifically do the two regex statements, below, mean in plain English?
2) Does anyone have a recommendation on websites / sources where I can learn more about Regexes? (preferably in C#)
if (Regex.IsMatch(testString, @"^(\s*?)(" + tag + @")(\s*?),", RegexOptions.IgnoreCase))
{
    result = true;
}
else if (Regex.IsMatch(testString, @",(\s*?)(" + tag + @")(\s*?),", RegexOptions.IgnoreCase))
{
    result = true;
}
It’s going to be difficult to tell what that regex means without knowing what’s in tag. In fact, it looks like that regex is broken (or, at least, doesn’t properly escape inputs). Roughly speaking, for the first regex:
- ^ says to match at the beginning of the string.
- (...) sets up a capturing group (which is available, although this example apparently doesn’t use it).
- \s matches any whitespace character (spaces, tabs, etc.).
- *? matches zero or more of the previous character (in this case, whitespace), and because it has a question mark, it matches the minimum number of characters needed to make the rest of the expression work.
- (" + tag + @") inserts the contents of tag into the regex. As I mention, that’s dangerous without escaping.
- (\s*?) matches the same as before (the minimum number of whitespace characters).
- , matches a trailing comma.

The second regex is very similar, but looks for a starting comma (rather than the beginning of the string).
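To make the escaping concern concrete, here is a minimal sketch of how the same check could be written with the tag escaped first. The class name TagMatcher and the method name ContainsTag are made up for illustration and are not part of the original application. Folding both patterns into one with (?:^|,) is my own simplification. Regex.Escape, Regex.IsMatch and RegexOptions.IgnoreCase are standard .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagMatcher
{
    // Hypothetical helper (not from the original code): returns true if
    // `tag` appears as a comma-terminated item in `testString`, either at
    // the start of the string or right after a comma.
    public static bool ContainsTag(string testString, string tag)
    {
        // Regex.Escape turns metacharacters in the tag (., +, *, parentheses, ...)
        // into literals, so the tag is matched as plain text.
        string escaped = Regex.Escape(tag);

        // (?:^|,) is a non-capturing group meaning "start of string OR a comma",
        // which covers both of the original patterns in one expression.
        // The capturing groups and lazy quantifiers are dropped because the
        // captures are never read and IsMatch only reports whether a match exists.
        string pattern = @"(?:^|,)\s*" + escaped + @"\s*,";

        return Regex.IsMatch(testString, pattern, RegexOptions.IgnoreCase);
    }
}
```

With this sketch, ContainsTag("foo, bar, baz,", "bar") is true, while ContainsTag("abc, x", "a.c") is false, because the dot is now treated as a literal dot rather than "any character". Like the original patterns, it still requires a comma after the tag, so the last item in a list without a trailing comma would not match. Whether that is intended is something to check against the application's data.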
I like the Python documentation for Regular Expressions, but it looks like this site has a pretty good, basic introduction, with C# examples.