I have a script language based on Antlr: A parser and a tree grammar

Asked: June 3, 2026

I have a script language based on Antlr: A parser and a tree grammar that builds runtime objects (e.g. statements). When I deal with the statements at runtime, I want to know the original source positions (e.g. when I throw errors, I want to state the line and position in the script source.)

What is the best strategy to attach the source positions to my runtime objects? And if I’m not asking too much, I want to have as little impact on my grammar files as possible.

I have tried to put as little code into the grammar as possible to increase quality, e.g. one of my (many) expressions looks like this:

multiplyExpression returns [Expression value]
: ^('*' l=expression r=expression)
{
    $value = sb.newBinaryExpression(CorIdentifier.MULTIPLY, $l.value, $r.value);
}
;

where sb is my ScriptBuilder that acts as an adapter between the generated code and my runtime. I know I can add the source position as an additional parameter to newBinaryExpression, but then I have to touch all other expressions as well. I was hoping that I can put the token stream into sb only once and fetch the source position from the stream without affecting the grammar source at all.

I was hoping that, since Antlr is used by many scripting languages, there is a standard way to handle this. Source position handling is a single aspect, and I don’t want it cluttered all over the grammar file, which would not be very DRY.

1 Answer

Editorial Team · Answer 1 · 2026-06-03T14:10:54+00:00

I was hoping that, since Antlr is used by many scripting languages, there is a standard way to handle this

You make it sound like ANTLR does not support this. Sure there is: every CommonToken and CommonTree object exposes public getLine() and getCharPositionInLine() methods, but you discard these instances and create your own nodes (Expression). Don’t be surprised that it takes some extra effort to embed this information in your own nodes 🙂
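To illustrate what that extra effort could look like, here is a minimal sketch. It does not depend on ANTLR: Positioned is a hypothetical stand-in for the two getters that CommonTree and CommonToken expose, and SourcePosition and BinaryExpression are made-up names, not part of the asker's runtime. The idea is only that the builder copies the position out of the tree node once and the runtime object keeps it for error messages.

```java
// Hypothetical stand-in for the two position getters mentioned above.
// In real ANTLR 3 code you would pass the CommonTree (or CommonToken) itself.
interface Positioned {
    int getLine();
    int getCharPositionInLine();
}

// Hypothetical immutable value object holding a position in the script source.
final class SourcePosition {
    final int line;
    final int column;

    SourcePosition(int line, int column) {
        this.line = line;
        this.column = column;
    }

    // Copy the position out of a tree node (or token) once, at build time.
    static SourcePosition of(Positioned node) {
        return new SourcePosition(node.getLine(), node.getCharPositionInLine());
    }

    @Override
    public String toString() {
        return "line " + line + ":" + column;
    }
}

// Hypothetical runtime object that remembers where it came from.
final class BinaryExpression {
    final String operator;
    final SourcePosition position;

    BinaryExpression(String operator, SourcePosition position) {
        this.operator = operator;
        this.position = position;
    }

    // Build a runtime error whose message is prefixed with the source position.
    RuntimeException error(String message) {
        return new RuntimeException(position + " " + message);
    }
}
```

With this shape, each builder method takes one extra argument (the node matched by the rule), so the grammar actions still change, but only by passing the node through.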

You could let your runtime objects extend CommonTree and let your (combined) grammar construct these custom runtime objects directly, so that your classes inherit the getLine() and getCharPositionInLine() methods. See: Using custom AST node types.
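The inheritance idea can be sketched roughly as follows. To keep the example runnable without the ANTLR jar, TreeNodeStandIn is a hypothetical placeholder for CommonTree that offers only the two position getters and child access. ExpressionNode, NumberNode and DivideNode are made-up names. In a real ANTLR 3 setup the base class would be CommonTree, and the nodes would be created by the parser (for example through a custom tree adaptor) rather than by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical placeholder for org.antlr.runtime.tree.CommonTree,
// reduced to the position getters and child handling used below.
class TreeNodeStandIn {
    private final int line;
    private final int charPositionInLine;
    private final List<TreeNodeStandIn> children = new ArrayList<>();

    TreeNodeStandIn(int line, int charPositionInLine) {
        this.line = line;
        this.charPositionInLine = charPositionInLine;
    }

    int getLine() { return line; }
    int getCharPositionInLine() { return charPositionInLine; }
    void addChild(TreeNodeStandIn child) { children.add(child); }
    TreeNodeStandIn getChild(int i) { return children.get(i); }
}

// Runtime objects ARE tree nodes, so they inherit the position getters.
abstract class ExpressionNode extends TreeNodeStandIn {
    ExpressionNode(int line, int charPositionInLine) {
        super(line, charPositionInLine);
    }

    abstract int evaluate();
}

class NumberNode extends ExpressionNode {
    private final int value;

    NumberNode(int value, int line, int charPositionInLine) {
        super(line, charPositionInLine);
        this.value = value;
    }

    @Override
    int evaluate() { return value; }
}

class DivideNode extends ExpressionNode {
    DivideNode(int line, int charPositionInLine) {
        super(line, charPositionInLine);
    }

    @Override
    int evaluate() {
        int left = ((ExpressionNode) getChild(0)).evaluate();
        int right = ((ExpressionNode) getChild(1)).evaluate();
        if (right == 0) {
            // The inherited getters supply the position; no extra plumbing needed.
            throw new IllegalStateException(
                "line " + getLine() + ":" + getCharPositionInLine() + " division by zero");
        }
        return left / right;
    }
}
```

The trade-off is that the runtime classes become coupled to the tree class, in exchange for not threading a position argument through every builder call.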
