Possible Duplicate:
Division/RegExp conflict while tokenizing Javascript
I’m writing a JS lexer for fun and there’s just one piece missing: the part that handles regex literals.
Take, for instance, the following valid piece of JS code: /ab+c/;
How can a JS lexer know whether it’s dealing with a regex or with
[Operator('/'), Identifier('ab'), Operator('+'), Identifier('c'), Operator('/'), Semicolon] ?
You would need to implement a lexical grammar that includes regex literals. According to the ECMAScript documentation, “A RegExp grammar for ECMAScript is given in 15.10.”
See also: ECMAScript Lexical Conventions
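To make this concrete: the spec's lexical grammar has two goal symbols, InputElementDiv and InputElementRegExp, and the surrounding syntactic context decides which one applies at a given `/`. A standalone lexer can approximate that by looking at the previous significant token. The sketch below is only that heuristic, not the spec's full algorithm. The names `regexAllowed`, `scanRegex`, and `tokenize` are made up for illustration, and it deliberately ignores comments, strings, multi-character operators, and hard cases such as a regex following `)` in `if (x) /re/.test(y)` or division after a postfix `++`.

```javascript
// Heuristic sketch: decide regex vs. division from the previous token.
// All names here are hypothetical; keywords are lexed as plain Identifiers.
const KEYWORDS_BEFORE_REGEX = new Set([
  "return", "typeof", "instanceof", "in", "of", "new",
  "delete", "void", "throw", "case", "do", "else",
]);

// May a "/" seen now start a regex literal? `prev` is the previous
// token, or null at the start of input.
function regexAllowed(prev) {
  if (prev === null) return true;
  if (prev.type === "Number" || prev.type === "Regex") return false;
  if (prev.type === "Identifier") return KEYWORDS_BEFORE_REGEX.has(prev.value);
  // Punctuator: a closing bracket usually ends an expression, so "/" divides.
  return !(prev.value === ")" || prev.value === "]" || prev.value === "}");
}

// Scan a regex literal beginning at src[start] === "/".
// Returns the index just past the closing "/" and any flags.
function scanRegex(src, start) {
  let i = start + 1;
  let inClass = false; // inside [...], where "/" does not terminate
  while (i < src.length) {
    const ch = src[i];
    if (ch === "\n") break; // regex literals cannot span lines
    if (ch === "\\") { i += 2; continue; } // skip the escaped character
    if (ch === "[") inClass = true;
    else if (ch === "]") inClass = false;
    else if (ch === "/" && !inClass) {
      i++;
      while (i < src.length && /[A-Za-z]/.test(src[i])) i++; // flags
      return i;
    }
    i++;
  }
  throw new SyntaxError("Unterminated regex literal at index " + start);
}

function tokenize(src) {
  const tokens = [];
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    const prev = tokens.length ? tokens[tokens.length - 1] : null;
    if (/\s/.test(ch)) { i++; continue; }
    if (/[A-Za-z_$]/.test(ch)) {
      let j = i + 1;
      while (j < src.length && /[A-Za-z0-9_$]/.test(src[j])) j++;
      tokens.push({ type: "Identifier", value: src.slice(i, j) });
      i = j;
    } else if (/[0-9]/.test(ch)) {
      let j = i + 1;
      while (j < src.length && /[0-9.]/.test(src[j])) j++;
      tokens.push({ type: "Number", value: src.slice(i, j) });
      i = j;
    } else if (ch === "/" && regexAllowed(prev)) {
      const end = scanRegex(src, i);
      tokens.push({ type: "Regex", value: src.slice(i, end) });
      i = end;
    } else {
      // Everything else is treated as a single-character punctuator.
      tokens.push({ type: "Punctuator", value: ch });
      i++;
    }
  }
  return tokens;
}
```

With this, `tokenize("/ab+c/;")` produces a single Regex token followed by the `;` punctuator, while `tokenize("a / b / c")` produces identifiers separated by `/` punctuators. A full implementation would more likely let the parser tell the lexer which goal symbol to use, since the previous-token rule cannot resolve every case.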