I’m trying out antlr4 with a somewhat large grammar that worked in antlr3

Question

Asked: June 17, 2026

I’m trying out antlr4 with a somewhat large grammar that worked in antlr3. I worked through the two grammar changes that were needed, and now the tool produces the lexer and parser.

However, the lexer has a compile error:

1) The type generates a string that requires more than 65535 bytes to
encode in Utf8 format in the constant pool

The error shows up in Eclipse on the class name, so I’m not sure exactly which string it refers to, but I suspect it is this very long String:

    public static final String _serializedATN =
        "\1\2\u01c5\u1741\6\uffff\2\0\7\0\2\1\7\1\2\2\7\2\2\3\7\3\2\4\7\4\2\5\7"+
        "\5\2\6\7\6\2\7\7\7\2\b\7\b\2\t\7\t\2\n\7\n\2\13\7\13\2\f\7\f\2\r\7\r\2"+
... etc, etc (few hundred lines of unicode)

This looks like a bug in the parser generator, but it’s possible there is some new setting required for antlr4 that I’m not aware of (?)


1 Answer

Editorial Team · Answer 1 · 2026-06-17T11:24:52+00:00

This is really a limitation in Java, not a bug in ANTLR (the correct serialization string is created, but Java’s encoding can’t store it). Last week we tweaked the _serializedATN representation to help with this problem, but we have not implemented a complete workaround involving breaking the serialized form into multiple strings or allowing its storage in a separate file loaded at runtime.
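To make the limit concrete: a string constant in a class file is stored in "modified UTF-8" with a 16-bit length field, so what matters is the encoded byte count, not the number of characters. The sketch below estimates that byte count for a string. The class and method names are made up for illustration and are not part of ANTLR.

```java
// Hypothetical helper (not part of ANTLR): estimates how many bytes a string
// literal occupies in a class file's constant pool.
final class Utf8ConstantSize {
    // A constant pool UTF-8 entry stores its length in an unsigned 16-bit field.
    static final int MAX_UTF8_BYTES = 65535;

    // Byte length of s in the JVM's modified UTF-8 encoding.
    static int modifiedUtf8Length(String s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                bytes += 1;          // plain ASCII except NUL
            } else if (c <= 0x07FF) {
                bytes += 2;          // U+0000 and U+0080..U+07FF
            } else {
                bytes += 3;          // U+0800..U+FFFF, including surrogates
            }
        }
        return bytes;
    }

    static boolean fitsInConstantPool(String s) {
        return modifiedUtf8Length(s) <= MAX_UTF8_BYTES;
    }
}
```

Note that characters such as `\0` and `\uffff`, both of which appear in the serialized ATN shown in the question, cost 2 and 3 bytes respectively, so the limit can be reached well before the string is 65535 characters long.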

There may be some ways to tweak the grammar to reduce the size of the required ATN, but I would need to see the grammar to evaluate that.

Update: Starting with ANTLR 4.1, _serializedATN is now split as necessary to ensure the constant pool limit is not exceeded in the generated code. See issue 76 for details.
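For anyone who needs to do something similar by hand, here is a rough sketch of the splitting idea. It is not ANTLR's actual generated code, and the class and method names are hypothetical. It cuts a long string into pieces whose modified UTF-8 size stays under a byte budget, and joins them again at run time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not ANTLR's implementation): split a long string into
// chunks that each fit in one constant pool entry, then rejoin at run time.
final class SerializedChunks {
    static final int MAX_UTF8_BYTES = 65535;

    // Bytes used by one char in the JVM's modified UTF-8 encoding.
    static int encodedSize(char c) {
        if (c >= 0x0001 && c <= 0x007F) return 1;
        if (c <= 0x07FF) return 2;   // includes U+0000
        return 3;
    }

    // Splits data so that no chunk exceeds maxBytes when encoded.
    static List<String> split(String data, int maxBytes) {
        if (maxBytes < 3) {
            throw new IllegalArgumentException("maxBytes must be at least 3");
        }
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int bytes = 0;
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            int size = encodedSize(c);
            if (bytes + size > maxBytes) {
                // Current chunk is full: close it and start a new one.
                chunks.add(current.toString());
                current.setLength(0);
                bytes = 0;
            }
            current.append(c);
            bytes += size;
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    // Rebuilds the original string from its chunks at run time.
    static String join(List<String> chunks) {
        return String.join("", chunks);
    }
}
```

In generated source, each chunk would have to be emitted as its own string literal and the pieces combined by a method call at run time. Concatenating literals with `+` does not help, because the compiler folds a constant expression back into a single constant.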
