Implement regular expression matching with support for ‘.’ and ‘*’.
‘.’ Matches any single character. ‘*’ Matches zero or more of the preceding element. The matching should cover the entire input string (not partial).
Some examples:
isMatch("aa", "a") → false
isMatch("aa", "aa") → true
isMatch("aaa", "aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true
The author gives the following solution, which is really beautiful.
bool isMatch(const char *s, const char *p) {
    assert(s && p);
    if (*p == '\0') return *s == '\0';
    // next char is not '*': must match current character
    if (*(p+1) != '*') {
        assert(*p != '*');
        return ((*p == *s) || (*p == '.' && *s != '\0')) && isMatch(s+1, p+1);
    }
    // next char is '*'
    while ((*p == *s) || (*p == '.' && *s != '\0')) {
        if (isMatch(s, p+2)) return true;
        s++;
    }
    return isMatch(s, p+2);
}
The author also gives some further thoughts:
If you think carefully, you can come up with cases where the above code runs in exponential time. Could you think of some examples? How would you make the above code more efficient?
I came up with one case that takes a long time to produce a result even though the strings s and p are not very long:
s[] = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
p[] = "a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*b"
Can anyone help me verify this answer?
How should I go about finding this kind of extreme test case?
The best way to understand why your case exhibits exponential behavior is to experiment with the code a bit, gather some empirical data, and form hypotheses from it.
First, let’s add some simple “logging”:
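A minimal sketch of what such logging could look like, assuming a global call counter (the name g_calls is hypothetical) and one printed line per recursive call. The matching logic is the same as in the solution quoted in the question; only the two marked lines are added.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical global counter, bumped once per call to isMatch.
static long g_calls = 0;

bool isMatch(const char *s, const char *p) {
    assert(s && p);
    ++g_calls;                                       // added: count this call
    std::printf("isMatch(\"%s\", \"%s\")\n", s, p);  // added: log the arguments
    if (*p == '\0') return *s == '\0';
    // next char is not '*': must match current character
    if (*(p + 1) != '*') {
        assert(*p != '*');
        return ((*p == *s) || (*p == '.' && *s != '\0')) && isMatch(s + 1, p + 1);
    }
    // next char is '*': try every possible number of repetitions
    while ((*p == *s) || (*p == '.' && *s != '\0')) {
        if (isMatch(s, p + 2)) return true;
        s++;
    }
    return isMatch(s, p + 2);
}
```

Each printed line corresponds to one recursive call, so the number of output lines and the final value of g_calls should agree.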
Now let’s run a number of experiments, making sure to reset the count before each one (remember that in real code, global variables are to be avoided 🙂).
Look at the output of each experiment, count the lines it generated, and ask yourself “how does the number of recursive calls grow as I lengthen my string?” (Classic empirical algorithm analysis!)
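One way to run such experiments is sketched below. It assumes a quiet copy of the matcher that only counts calls, plus a helper that builds inputs of the same shape as the case in the question: n copies of 'a' matched against k copies of "a*" followed by 'b'. The names g_count, matchCounted, and countCalls are hypothetical.

```cpp
#include <string>

// Hypothetical counter; reset by countCalls before every experiment.
static long long g_count = 0;

// Same logic as the recursive matcher in the question, minus the printing.
static bool matchCounted(const char *s, const char *p) {
    ++g_count;
    if (*p == '\0') return *s == '\0';
    if (*(p + 1) != '*') {
        return ((*p == *s) || (*p == '.' && *s != '\0')) &&
               matchCounted(s + 1, p + 1);
    }
    while ((*p == *s) || (*p == '.' && *s != '\0')) {
        if (matchCounted(s, p + 2)) return true;
        s++;
    }
    return matchCounted(s, p + 2);
}

// Number of recursive calls needed to match n 'a's against
// k copies of "a*" followed by a trailing 'b' (which never matches).
long long countCalls(int n, int k) {
    std::string s(n, 'a');
    std::string p;
    for (int i = 0; i < k; ++i) p += "a*";
    p += 'b';
    g_count = 0;  // reset before each experiment
    matchCounted(s.c_str(), p.c_str());
    return g_count;
}
```

For example, countCalls(3, 1) is 5 and countCalls(3, 2) is 15. Printing countCalls(n, k) for slowly increasing n and k (keep both small at first) shows how quickly the count climbs.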
I’ve done the aaa case for you here: http://ideone.com/8t2kS
You can see it took 34 steps. Look at the output; it should give you some insight into the nature of the matching. And do try for strings of increasing length. Happy experimenting.
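As for making the code more efficient, one common approach (not necessarily what the original author had in mind) is to memoize the result for each pair of positions in s and p, so that every pair is evaluated at most once. The sketch below assumes a well-formed pattern in which '*' never appears first; the names isMatchMemo and matchFrom are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// memo[i][j] caches whether s+i matches p+j: -1 unknown, 0 false, 1 true.
typedef std::vector<std::vector<signed char> > Memo;

static bool matchFrom(const char *s, const char *p,
                      std::size_t i, std::size_t j, Memo &memo) {
    signed char &slot = memo[i][j];
    if (slot != -1) return slot == 1;  // already computed for this pair

    bool result;
    if (p[j] == '\0') {
        result = (s[i] == '\0');
    } else {
        // Does the current pattern element match the current character?
        bool first = s[i] != '\0' && (p[j] == s[i] || p[j] == '.');
        if (p[j + 1] == '*') {
            // Either skip "x*" entirely, or consume one character and stay on it.
            result = matchFrom(s, p, i, j + 2, memo) ||
                     (first && matchFrom(s, p, i + 1, j, memo));
        } else {
            result = first && matchFrom(s, p, i + 1, j + 1, memo);
        }
    }
    slot = result ? 1 : 0;
    return result;
}

bool isMatchMemo(const char *s, const char *p) {
    std::size_t n = std::strlen(s), m = std::strlen(p);
    Memo memo(n + 1, std::vector<signed char>(m + 1, -1));
    return matchFrom(s, p, 0, 0, memo);
}
```

With this change the number of distinct subproblems is bounded by (strlen(s) + 1) * (strlen(p) + 1), so the long "a*a*...b" case from the question should return almost immediately.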