I’ve encountered the following token in a regular expression: [\s\S]*?
If I understand this correctly, the character class means “match a whitespace character or a non-whitespace character”. Therefore, would this not do exactly the same thing as .*?
One possible difference is that . usually does not match newlines. However, this regular expression was written in Ruby and uses the m modifier, which means that . does, in fact, match newlines.
Is there any other reason to use [\s\S]*? instead of .*?
In case it helps, the regular expression I am looking at appears inside the sprockets library in the HEADER_PATTERN constant on line 97. The full expression is:
/
\A \s* (
(\/\* ([\s\S]*?) \*\/) |
(\#\#\# ([\s\S]*?) \#\#\#) |
(\/\/ ([^\n]*) \n?)+ |
(\# ([^\n]*) \n?)+
)
/mx
You interpreted the regex correctly.
That seems like a holdover from other languages whose regex engines have no flag that makes . match newlines (Ruby's m flag, called the s flag in most other implementations).
One reason to use the construct is to leave the m flag off: . then keeps its default meaning (any character except a newline), while [\s\S] is still available wherever you need to match everything, newlines included.
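
To make the difference concrete, here is a small Ruby sketch. The sample strings and variable names are made up for illustration and are not taken from sprockets; it only shows how . and [\s\S] behave with and without the m flag.

```ruby
text = "/* first line\nsecond line */ rest"

# Without the m flag, . stops at a newline, so the multi-line comment
# is not matched at all and the result is nil.
dot_no_m = text[/\/\*(.*?)\*\//, 1]

# With the m flag, . also matches newlines.
dot_with_m = text[/\/\*(.*?)\*\//m, 1]

# [\s\S] matches any character, whether or not the m flag is set.
class_no_m = text[/\/\*([\s\S]*?)\*\//, 1]

# Mixing both in one pattern without the m flag:
# (.*) stays on a single line, ([\s\S]*?) may span several lines.
mixed = "key: value\nbody line 1\nbody line 2\nEND"
m = mixed.match(/\Akey: (.*)\n([\s\S]*?)\nEND/)

p dot_no_m    # nil
p dot_with_m  # " first line\nsecond line "
p class_no_m  # " first line\nsecond line "
p m[1]        # "value"
p m[2]        # "body line 1\nbody line 2"
```

In a pattern that already carries the m flag, as HEADER_PATTERN does, the two forms should behave the same, so there the choice is mostly a matter of habit or portability.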