In this regex:
$line = 'this is a regular expression';
$line =~ s/^(\w+)\b(.*)\b(\w+)$/$3 $2 $1/;
print $line;
Why is $2 equal to " is a regular "? My thought process is that (.*) should be greedy and match all characters until the end of the line and therefore $3 would be empty.
That’s not happening, though. The regex matcher is somehow stopping right before the last word boundary and populating $3 with what comes after it, while the rest of the string goes to $2.
Any explanation?
Thanks.
$3 can’t be empty when using this regex because the corresponding capturing group is (\w+), which must match at least one word character or the whole match will fail.
So what happens is this: (.*) first matches " is a regular expression" (everything after "this", including the leading space), \b matches at the end of the string, and (\w+) fails to match because nothing is left for it. The regex engine then backtracks, with (.*) giving up one character at a time, until (.*) matches " is a regular " (note the match includes the spaces), \b matches the word boundary before the e, and (\w+) matches "expression".
If you change the final (\w+) to (\w*), then you will end up with the result you expected, where (.*) consumes the rest of the string and $3 is empty.
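To see the difference for yourself, here is a small sketch that runs both patterns against the string from the question and prints the captures in brackets so the spaces are visible. The `captures` helper is a made-up name for illustration only, not something from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): match a pattern against
# the sample string and return the three captured groups.
sub captures {
    my ($pattern) = @_;
    my $line = 'this is a regular expression';
    my @groups = $line =~ $pattern;   # list context returns ($1, $2, $3)
    return @groups;
}

# Original pattern: the last group is (\w+), so it needs at least one
# word character and forces (.*) to backtrack.
my @plus = captures(qr/^(\w+)\b(.*)\b(\w+)$/);
print "plus: [$plus[0]] [$plus[1]] [$plus[2]]\n";
# prints: plus: [this] [ is a regular ] [expression]

# Variant: the last group is (\w*), which may match the empty string,
# so (.*) keeps everything after the first word.
my @star = captures(qr/^(\w+)\b(.*)\b(\w*)$/);
print "star: [$star[0]] [$star[1]] [$star[2]]\n";
# prints: star: [this] [ is a regular expression] []
```

Note that even in the second case $1 still holds "this", so (.*) gets the remainder of the line rather than the entire line.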