The ANTLR website describes two approaches to implementing include directives. The first approach is

Asked: June 10, 2026

The ANTLR website describes two approaches to implementing “include” directives. The first approach is to recognize the directive in the lexer and include the file lexically (by pushing the CharStream onto a stack and replacing it with one that reads the new file); the second is to recognize the directive in the parser, launch a sub-parser to parse the new file, and splice in the AST generated by the sub-parser. Neither of these is quite what I need.

In the language I’m parsing, recognizing the directive in the lexer is impractical for a few reasons:

There is no self-contained character pattern that always means “this is an include directive”. For example, Include "foo"; at top level is an include directive, but in Array bar --> Include "foo"; or Constant Include "foo"; the word Include is an identifier.
The name of the file to include may be given as a string or as a constant identifier, and such constants can be defined with arbitrarily complex expressions.

So I want to trigger the inclusion from the parser. But to perform the inclusion, I can’t launch a sub-parser and splice the AST together; I have to splice the tokens. It’s legal for a block to begin with { in the main file and be terminated by } in the included file. A file included inside a function can even close the function definition and start a new one.

It seems like I’ll need something like the first approach but at the level of TokenStreams instead of CharStreams. Is that a viable approach? How much state would I need to keep on the stack, and how would I make the parser switch back to the original token stream instead of terminating when it hits EOF? Or is there a better way to handle this?
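To make the idea concrete, here is a minimal sketch of such a stack at the token level. It uses no ANTLR classes: tokens are plain strings, and the names IncludeTokenStack, pushInclude and nextToken are hypothetical. Under these assumptions the only state kept per file is its read position, and the end of an included file is never reported to the parser; the stack simply resumes the enclosing file. The sketch ignores lookahead: tokens a real parser has already buffered would need separate handling.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a token source that can nest included files. */
class IncludeTokenStack {
    static final String EOF = "<EOF>";

    // One entry per open file; the top entry is the file currently being read.
    private final Deque<Iterator<String>> sources = new ArrayDeque<>();

    IncludeTokenStack(List<String> mainFileTokens) {
        sources.push(mainFileTokens.iterator());
    }

    /** Called by the parser once it has recognized an include directive. */
    void pushInclude(List<String> includedFileTokens) {
        sources.push(includedFileTokens.iterator());
    }

    /** Returns the next token, resuming the enclosing file when an included one runs out. */
    String nextToken() {
        while (!sources.isEmpty()) {
            Iterator<String> top = sources.peek();
            if (top.hasNext()) {
                return top.next();
            }
            sources.pop(); // this file is exhausted: fall back to the file that included it
        }
        return EOF; // reported only when the outermost file has ended
    }
}
```

With this shape, a block opened by { in the main file and closed by } in the included file reaches the parser as one uninterrupted token sequence.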

==========

Here’s an example of the language, demonstrating that blocks opened in the main file can be closed in the included file (and vice versa). Note that the # before Include is required when the directive is inside a function, but optional outside.

main.inf:

[ Main;
  print "This is Main!";
  if (0) {
  #include "other.h";
  print "This is OtherFunction!";
];

other.h:

  } ! end if
];  ! end Main

[ OtherFunction;

1 Answer

Editorial Team · Answer 1 · 2026-06-10T04:55:29+00:00

One possibility is to have the parser, for each Include statement, create a new instance of the lexer over the included file and insert the tokens that lexer produces into the parser’s own token stream at the index the parser is currently at (see the insertTokens(...) method in the parser’s @members block).

Here’s a quick demo:

Inform6.g

grammar Inform6;

options {
  output=AST;
}

tokens {
  STATS;
  F_DECL;
  F_CALL;
  EXPRS;
}

@parser::header {
  import java.util.Map;
  import java.util.HashMap;
}

@parser::members {
  private Map<String, String> memory = new HashMap<String, String>(); 

  private void putInMemory(String key, String str) {
    String value;
    if(str.startsWith("\"")) {
      value = str.substring(1, str.length() - 1);
    }
    else {
      value = memory.get(str);
    }
    memory.put(key, value);
  }

  private void insertTokens(String fileName) {
    try {
      // a Str token still carries its surrounding quotes: strip them
      if(fileName.startsWith("\"")) {
        fileName = fileName.substring(1, fileName.length() - 1);
      }
      // tokenize the included file completely
      CommonTokenStream thatStream = new CommonTokenStream(new Inform6Lexer(new ANTLRFileStream(fileName)));
      thatStream.fill();
      List extraTokens = thatStream.getTokens();
      extraTokens.remove(extraTokens.size() - 1); // remove EOF
      // splice the new tokens in at the parser's current position
      CommonTokenStream thisStream = (CommonTokenStream)this.getTokenStream();
      thisStream.getTokens().addAll(thisStream.index(), extraTokens);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
}

parse
 : stats EOF -> stats
 ;

stats
 : stat* -> ^(STATS stat*)
 ;

stat
 : function_decl
 | function_call
 | include
 | constant
 | if_stat
 ;

if_stat
 : If '(' expr ')' '{' stats '}' -> ^(If expr stats)
 ;

function_decl
 : '[' id ';' stats ']' ';' -> ^(F_DECL id stats)
 ;

function_call
 : Id exprs ';' -> ^(F_CALL Id exprs)
 ;

include
 : Include Str ';' {insertTokens($Str.text);}            -> /* omit statement from AST */
 | Include id ';'  {insertTokens(memory.get($id.text));} -> /* omit statement from AST */
 ;

constant
 : Constant id expr ';' {putInMemory($id.text, $expr.text);} -> ^(Constant id expr)
 ;

exprs
 : expr (',' expr)* -> ^(EXPRS expr+)
 ;

expr
 : add_expr
 ;

add_expr
 : mult_expr (('+' | '-')^ mult_expr)*
 ;

mult_expr
 : atom (('*' | '/')^ atom)*
 ;

atom
 : id
 | Num
 | Str
 | '(' expr ')' -> expr
 ;

id
 : Id
 | Include
 ;

Comment  : '!' ~('\r' | '\n')* {skip();};
Space    : (' ' | '\t' | '\r' | '\n')+ {skip();};
If       : 'if';
Include  : 'Include';
Constant : 'Constant';
Id       : ('a'..'z' | 'A'..'Z') ('a'..'z' | 'A'..'Z' | '0'..'9')*;
Str      : '"' ~'"'* '"';
Num      : '0'..'9'+ ('.' '0'..'9'+)?;

main.inf

Constant IMPORT "other.h";

[ Main;
  print "This is Main!";
  if (0) {    

  Include IMPORT;

  print "This is OtherFunction!";
];

other.h

  } ! end if
];  ! end Main

[ OtherFunction;

Main.java

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    // create lexer & parser
    Inform6Lexer lexer = new Inform6Lexer(new ANTLRFileStream("main.inf"));
    Inform6Parser parser = new Inform6Parser(new CommonTokenStream(lexer));

    // print the AST
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT((CommonTree)parser.parse().getTree());
    System.out.println(st);
  }
}

To run the demo, do the following on the command line:

java -cp antlr-3.3.jar org.antlr.Tool Inform6.g 
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main

The output is the AST in DOT notation, which can be rendered as a graph (for example with Graphviz):

[Figure: rendered AST]
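The splice that insertTokens performs can also be modelled without ANTLR, which may make the index arithmetic easier to see. The following is only a sketch under simplifying assumptions: tokens are plain strings, and the TokenSplice class and its splice method are hypothetical names, not part of the ANTLR runtime. The cursor plays the role of thisStream.index(), that is, the position of the next token the parser will read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TokenSplice {
    /**
     * Inserts the included file's tokens into the buffer at the cursor,
     * so that they become the next tokens read from the buffer.
     */
    static void splice(List<String> buffer, int cursor, List<String> included) {
        buffer.addAll(cursor, included);
    }

    public static void main(String[] args) {
        // Tokens of the main file; the directive "Include IMPORT ;" has just been matched,
        // so the cursor (4) points at the token that follows it.
        List<String> buffer = new ArrayList<>(Arrays.asList("{", "Include", "IMPORT", ";", "print"));
        splice(buffer, 4, Arrays.asList("}", "]", ";"));
        System.out.println(buffer); // prints [{, Include, IMPORT, ;, }, ], ;, print]
    }
}
```

Because the insertion happens at the cursor rather than at the end of the buffer, the closing } from the included file is seen before the print that follows the directive in the main file.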
