I have a lexical rule (Integer) which uses some fragments. In a parser rule (parse) I want to rewrite my tree differently depending on which fragment generated the token in question. I have made a small grammar to demonstrate what I’m attempting:
grammar subrange;

options {
  output=AST;
}

tokens {
  NumberNode;
  DecimalNode;
  BinaryNode;
  HexNode;
  OctalNode;
}

parse
  : Integer+ -> ^(NumberNode Integer)+
  ;

Integer
  : DECIMAL_LITERAL
  | BINARY_LITERAL
  | HEX_LITERAL
  | OCTAL_LITERAL
  ;

fragment BINARY_LITERAL
  : '2#' ('0' | '1')+
  ;

fragment HEX_LITERAL
  : ('16#' | '0' ('x'|'X')) HEX_DIGIT+
  ;

fragment HEX_DIGIT
  : (DIGIT|'a'..'f'|'A'..'F')
  ;

fragment DECIMAL_LITERAL
  : ('0' | '1'..'9' DIGIT*)
  ;

fragment OCTAL_LITERAL
  : '8#' ('0'..'7')+
  ;

fragment DIGIT
  : '0'..'9'
  ;

SPACE : (' ' | '\t' | '\r' | '\n')+ {skip();};
I want the parse rule to rewrite a DECIMAL_LITERAL under an imaginary DecimalNode but a BINARY_LITERAL under a BinaryNode (rather than everything under a NumberNode).
I’m attempting to do this by changing the token type inside the lexical rule so that I can then rewrite accordingly inside the parse rule.
I think I should be able to do this with an action but I have been unable to figure out how to find the returned token in order to change its type. http://www.antlr.org/wiki/display/ANTLR3/Special+symbols+in+actions seems to indicate that $tokenref should work but it doesn’t get translated at all.
Or is there another way to accomplish this?
Thanks in advance.
It seems a bit odd to me: grouping all such literals under a single Integer token and then wanting to separate them again in a parser rule.
Why not just remove Integer, let each kind of literal be a token of its own, and rewrite each one in a parser rule?
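A minimal, untested sketch of that first suggestion for ANTLR 3, assuming the literal rules from the question simply lose their fragment keyword so that each becomes a token in its own right. The parser rule name integer is a placeholder chosen for illustration, not something taken from the question:

```antlr
grammar subrange;

options {
  output=AST;
}

tokens {
  DecimalNode;
  BinaryNode;
  HexNode;
  OctalNode;
}

parse
  : integer+
  ;

// Hypothetical parser rule: one alternative per literal token,
// each rewritten under its own imaginary node.
integer
  : DECIMAL_LITERAL -> ^(DecimalNode DECIMAL_LITERAL)
  | BINARY_LITERAL  -> ^(BinaryNode BINARY_LITERAL)
  | HEX_LITERAL     -> ^(HexNode HEX_LITERAL)
  | OCTAL_LITERAL   -> ^(OctalNode OCTAL_LITERAL)
  ;

// No longer fragments, so the lexer emits them as separate token types.
BINARY_LITERAL  : '2#' ('0' | '1')+ ;
HEX_LITERAL     : ('16#' | '0' ('x' | 'X')) HEX_DIGIT+ ;
DECIMAL_LITERAL : '0' | '1'..'9' DIGIT* ;
OCTAL_LITERAL   : '8#' ('0'..'7')+ ;

fragment HEX_DIGIT : DIGIT | 'a'..'f' | 'A'..'F' ;
fragment DIGIT     : '0'..'9' ;

SPACE : (' ' | '\t' | '\r' | '\n')+ {skip();} ;
```

With this layout the parser never has to inspect the token text, because the token type already says which kind of literal was matched.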
Or you could keep the Int(eger) rule but set the numerical value of the various int-literals inside it.
Be careful not to give a rule a name that an object, class or reserved word of the target language already has (Integer in the case of Java).
EDIT
Okay. I’ll leave my other answer there in case passers-by are wondering why on earth I’m proposing this… 🙂
Here’s what (I think) you’re after:
Parsing the input "2#1111 8#77 0xff 16#ff 123" will result in an AST in which each literal hangs under the node for its kind.
Since you’ve lost the information about what type of Integer each literal is, you will have to do this check in the integer rule (the {boolean-expression}? predicates placed after each -> in the rewrite rules).
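A hedged sketch of what such a predicate-guarded rule might look like in ANTLR 3 with the Java target. It is not the grammar originally posted with this answer. Two things here are assumptions: the token is renamed Int (following the naming caution above), and the prefix tests on $Int.text are just one illustrative way of recovering the base. The lexer rules are those from the question.

```antlr
grammar subrange;

options {
  output=AST;
}

tokens {
  DecimalNode;
  BinaryNode;
  HexNode;
  OctalNode;
}

parse
  : integer+
  ;

// Each rewrite is guarded by a {boolean-expression}? predicate;
// the last, unguarded rewrite is the fallback (decimal).
integer
  : Int
    -> {$Int.text.startsWith("2#")}? ^(BinaryNode Int)
    -> {$Int.text.startsWith("8#")}? ^(OctalNode Int)
    -> {$Int.text.startsWith("16#") ||
        $Int.text.startsWith("0x")  ||
        $Int.text.startsWith("0X")}? ^(HexNode Int)
    -> ^(DecimalNode Int)
  ;

// Single token type for all literals, as in the question
// (renamed from Integer to Int for this sketch).
Int
  : DECIMAL_LITERAL
  | BINARY_LITERAL
  | HEX_LITERAL
  | OCTAL_LITERAL
  ;

fragment BINARY_LITERAL  : '2#' ('0' | '1')+ ;
fragment HEX_LITERAL     : ('16#' | '0' ('x' | 'X')) HEX_DIGIT+ ;
fragment HEX_DIGIT       : DIGIT | 'a'..'f' | 'A'..'F' ;
fragment DECIMAL_LITERAL : '0' | '1'..'9' DIGIT* ;
fragment OCTAL_LITERAL   : '8#' ('0'..'7')+ ;
fragment DIGIT           : '0'..'9' ;

SPACE : (' ' | '\t' | '\r' | '\n')+ {skip();} ;
```

The trade-off is that the lexer stays as the question wrote it, while the parser repeats the prefix knowledge that the lexer already had when it matched the token.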