I am writing a tiny JavaScript parser in JavaScript.
I am at the tokenization stage.
I would like to know how to recognize when a regular expression begins and ends.
For example, if I had asked the same question about strings, I know the answer would be:
a string begins with a double quote (")
and ends at the next double quote (") that is not escaped by a preceding backslash (\).
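To make the string rule concrete, here is a rough sketch of what I have in mind. The name scanString and the convention of returning -1 on failure are just my own choices for illustration, and it only handles double-quoted strings.

```javascript
// Hypothetical helper: scan a double-quoted string literal starting at src[start].
// Returns the index just past the closing quote, or -1 if the string is unterminated.
function scanString(src, start) {
  if (src[start] !== '"') return -1;
  let i = start + 1;
  while (i < src.length) {
    const ch = src[i];
    if (ch === "\\") {
      // A backslash escapes the next character, so skip both.
      // Treat backslash + CRLF as a single line continuation.
      i += (src[i + 1] === "\r" && src[i + 2] === "\n") ? 3 : 2;
      continue;
    }
    if (ch === '"') return i + 1;                   // unescaped quote ends the string
    if (ch === "\n" || ch === "\r") return -1;      // raw line break is not allowed
    i++;
  }
  return -1; // ran off the end of the input
}
```

I am looking for the equivalent rule for regular expression literals.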
Any help is appreciated.
The ECMAScript language specification contains a full grammar for the language (in EBNF) in Annex A. It’s too large to reproduce here in its entirety, but the production for regular expressions is given as “RegularExpressionLiteral”.
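As a rough illustration of how that production reads, here is a minimal sketch of a scanner for the end of a regular expression literal. The names scanRegExpLiteral and isLineTerminator are hypothetical, and the flag check is simplified to ASCII identifier characters. The sketch assumes the tokenizer has already decided that the "/" at the given position starts a regular expression rather than a division operator; that decision depends on the syntactic context (roughly, whether the previous token can end an expression) and is not handled here.

```javascript
// Line terminators recognized by the ECMAScript lexical grammar.
function isLineTerminator(ch) {
  return ch === "\n" || ch === "\r" || ch === "\u2028" || ch === "\u2029";
}

// Hypothetical helper: scan a regular expression literal starting at src[start].
// Returns the index just past the literal (including flags), or -1 if it is malformed.
function scanRegExpLiteral(src, start) {
  if (src[start] !== "/") return -1;
  // "//" and "/*" begin comments, so the body cannot be empty or start with "*".
  if (src[start + 1] === "/" || src[start + 1] === "*") return -1;
  let i = start + 1;
  let inClass = false; // true while inside a [...] character class
  while (i < src.length) {
    const ch = src[i];
    if (isLineTerminator(ch)) return -1; // the literal may not span lines
    if (ch === "\\") {
      // A backslash escapes the next character, which may not be a line terminator.
      if (i + 1 >= src.length || isLineTerminator(src[i + 1])) return -1;
      i += 2;
      continue;
    }
    if (ch === "[") {
      inClass = true;
    } else if (ch === "]") {
      inClass = false;
    } else if (ch === "/" && !inClass) {
      // Unescaped "/" outside a class closes the body; flags follow.
      i++;
      // Simplification: only ASCII identifier characters are accepted as flags.
      while (i < src.length && /[A-Za-z0-9_$]/.test(src[i])) i++;
      return i;
    }
    i++;
  }
  return -1; // unterminated literal
}
```

The main difference from the string rule is the character class: a "/" inside [...] does not end the literal, so the scanner has to track whether it is inside one.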