I have been searching for an answer to this for about an hour now. A given ANTLR lexer rule consists of two (or more) sub-rules, and the lexer now produces a separate AST node for each of them.
Example:
[...]
variable: '$' CamelCaseIdentifier;
CamelCaseIdentifier: ('a'..'z') Identifier*;
Identifier: ('a'..'z' | 'A' .. 'Z' | '0'..'9')+;
[...]
With the given input of [...]$a[...], the result is ..., $, a, ...
I am looking for a way to tell the lexer that these rules should not be separated, so that the result is ..., $a, ...
Could anyone help me out?
Parser rules start with a lowercase letter and lexer rules with an uppercase letter. When you output an AST, each individual token in a parser rule becomes a separate node, so you’ll want to make the variable rule a lexer rule instead of a parser rule, i.e. rename it so that it starts with an uppercase letter (Variable).
But if you do only that, the input 123456 will still be tokenized as an Identifier, which is probably not what you want. Besides, the Identifier rule is better named AlphaNum. And if you make it a fragment rule, you make sure the lexer will never produce any AlphaNum tokens on its own, but will only use AlphaNum inside other lexer rules (like your CamelCaseIdentifier rule).
If you also want a rule that matches an Identifier, add a separate lexer rule built on the same fragment, for example one that starts with a letter and continues with AlphaNum characters.
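Putting those suggestions together, a minimal sketch in ANTLR 3 syntax could look like the grammar below. The grammar name Vars and the parse entry rule are hypothetical, the [...] parts of the original grammar are left out, and making CamelCaseIdentifier a fragment as well is an assumption made here so that it does not compete with Identifier for the same input.

```antlr
grammar Vars;  // hypothetical grammar name

options { output=AST; }

// Hypothetical entry rule: any sequence of the two token types.
parse
  : (Variable | Identifier)* EOF
  ;

// Lexer rule (uppercase first letter): '$' and the name form ONE token.
Variable
  : '$' CamelCaseIdentifier
  ;

// Must start with a letter, so 123456 is not an Identifier.
Identifier
  : ('a'..'z' | 'A'..'Z') AlphaNum*
  ;

// Fragment (assumption): only usable inside other lexer rules.
fragment CamelCaseIdentifier
  : 'a'..'z' AlphaNum*
  ;

// Fragment: the lexer never emits an AlphaNum token on its own.
fragment AlphaNum
  : 'a'..'z' | 'A'..'Z' | '0'..'9'
  ;
```

With a grammar along these lines, the input $a should reach the parser as a single Variable token, so it ends up as one AST node rather than as separate $ and a nodes.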