How can we distinguish a variable name and an identifier in an ANTLR grammar?


Asked: May 27, 2026 (2026-05-27T05:58:28+00:00)


VAR: ('A'..'Z')+ DIGIT*  ;
IDENT  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'-')*;

The piece of grammar (in ANTLR) does not work because the compiler will complain that IDENT may never be reached for some input. This seems to be a classic headache for compiler writers: the lexer hack.

ANTLR users, could you tell me a neat way to work around it? Thanks.

1 Answer

Editorial Team · Answer 1 · 2026-05-27T05:58:28+00:00

zell wrote:

The piece of grammar (in ANTLR) does not work because the compiler will complain that IDENT may never be reached for some input.

No, that is not correct. The following grammar:

grammar T;

parse
  :  .* EOF
  ;

VAR   : ('A'..'Z')+ DIGIT*  ;
IDENT : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'-')*;

fragment DIGIT : '0'..'9';

does not produce any error or warning. The lexer simply creates two types of tokens:

if the longest possible match consists of one or more uppercase ASCII letters followed by zero or more digits (for example BAR or X1), a VAR is created, because both rules match the same text and VAR is defined before IDENT;
otherwise, if something starts with an ASCII letter or underscore, followed by ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'-')*, an IDENT is created.

Note that an IDENT can therefore never consist solely of uppercase ASCII letters followed by optional digits: such input will always become a VAR.

So, if you have a parser rule that looks like:

foo
  :  IDENT
  ;

and the entire input is "BAR", then there will be a parser error because the lexer will not produce an IDENT token, but a VAR token, even though the parser “asks” for an IDENT.

You must understand that no matter what the parser asks from the lexer, the lexer operates independently from the parser.
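
Because the lexer picks the token type on its own, one common way to cope is to let a parser rule accept both token types wherever a general identifier is allowed. The grammar below is only a sketch under that assumption: the rule name `name` is hypothetical and not part of the original grammar, and the lexer rules are copied unchanged from above.

```
grammar T;

parse
  :  name EOF
  ;

// Hypothetical helper rule: a general identifier may arrive as either
// token type, since text such as "BAR" is always tokenized as VAR.
name
  :  IDENT
  |  VAR
  ;

VAR   : ('A'..'Z')+ DIGIT*  ;
IDENT : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'-')*;

fragment DIGIT : '0'..'9';
```

With this rule, the input "BAR" is still tokenized as a VAR, but `name` accepts it. Places in the grammar that must only accept a variable can keep referring to VAR directly.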
