I’m reading Jeffrey Friedl’s book Mastering Regular Expressions, 3rd Ed. On page 274, Friedl asks readers to investigate why the regex /x([^/]|[^x]/)*x/, applied to the string “years = days /x divide x//365; /x assume non-leap year x/”, matches everything from the first /x through the final x/.
I deleted the trailing x/ from the regex. The regex /x([^/]|[^x]/)* then matches only “/x divide x//365; ”. But after I added the x/ back, the regex /x([^/]|[^x]/)*x/ matches “/x divide x//365; /x assume non-leap year x/”.
Could anybody walk me through the backtracking steps Perl’s regex engine takes for the trailing x/?
Here is my Perl script for this question.
my $str = "years = days /x divide x//365; /x assume non-leap year x/";
if ($str =~ m{(/x([^/]|[^x]/)*)}) {
    print "\$1: '$1'\n"; # output: $1: '/x divide x//365; '
} else {
    print "not matched.\n";
}
$str = "years = days /x divide x//365; /x assume non-leap year x/";
if ($str =~ m{(/x([^/]|[^x]/)*x/)}) {
    print "\$1: '$1'\n"; # output: $1: '/x divide x//365; /x assume non-leap year x/'
} else {
    print "not matched.\n";
}
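To see exactly where the first pattern gives up, here is a minimal sketch that prints the match offsets and the text left over. It relies on Perl's built-in @- and @+ arrays (start and end offsets of the last match); the variable names $start, $end and $rest are my own.

```perl
use strict;
use warnings;

my $str = "years = days /x divide x//365; /x assume non-leap year x/";

# Same pattern as the first test above, without the trailing x/.
my ($start, $end, $rest);
if ($str =~ m{/x([^/]|[^x]/)*}) {
    ($start, $end) = ($-[0], $+[0]);   # offsets of the whole match
    $rest = substr($str, $end);        # text the pattern did not consume
    print "matched offsets $start..$end\n";
    print "remaining text: '$rest'\n";
}
```

The remaining text starts at the second /x. At that slash, [^/] cannot match, and [^x]/ would need a slash as the following character but finds an x, so the star stops there.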
Here’s the rundown:
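So basically it says: start with /x, then match everything but x/, and close out with an x/. That is the intent, but the second alternative [^x]/ only forbids an x before a slash. It happily matches the space and slash in “ /x”, so once the trailing x/ forces the engine to backtrack, the group steps past the second /x and the match runs on to the final x/.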
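The backtracking asked about can be made visible with a small sketch. It assumes Perl's embedded code construct (?{ ... }), which perlre documents with some caveats, plus pos() and $^N to report what the second alternative just consumed. The names @log, $matched and $whole are mine, and the log also records attempts on paths the engine later abandons, so treat it as a trace of attempts rather than of the final match only.

```perl
use strict;
use warnings;

my $str = "years = days /x divide x//365; /x assume non-leap year x/";

our @log;   # package variable filled from inside the pattern

my $matched = $str =~ m{
    /x
    (?:
        [^/]            # alternative 1, any single non-slash character
      |
        ([^x]/)         # alternative 2, a non-x character followed by a slash
        (?{ push @log, sprintf("offset %2d: '%s'", pos() - 2, $^N) })
    )*
    x/
}x;

my $whole = $matched ? $& : '';

print "alternative 2 succeeded at:\n";
print "  $_\n" for @log;
print "whole match: '$whole'\n";
```

The log entry to look for is the one for ' /', the space and slash just before the second /x. When x/ first fails at that slash, the engine reopens the last iteration, which had matched the space with [^/], and retries it with [^x]/, which swallows the slash as well. The star then continues with [^/] up to the last x, fails on the final slash, and gives that x back in one more backtracking step so the trailing x/ can match.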