The Archive Base

Editorial Team
Asked: June 15, 2026

So I’m writing a compiler in Java using ANTLR, and I’m a little puzzled

So I’m writing a compiler in Java using ANTLR, and I’m a little puzzled by how it deals with errors.

The default behavior seems to be to print an error message and then attempt, by means of token insertion and such, to recover from the error and continue parsing. I like this in principle; it means that (in the best case) if the user has committed more than one syntax error, they’ll get one message per error, but it’ll mention all the errors instead of forcing them to recompile to discover the next one. The default error message is fine for my purposes. The trouble comes when it’s done reading all the tokens.

I am, of course, using ANTLR’s tree constructors to build abstract syntax trees. While it’s nice for the parse to continue through syntax errors so the user can see all the errors, once it’s done parsing I want to get an exception or some kind of indication that the input wasn’t syntactically valid; that way I can stop the compilation and tell the user “sorry, fix your syntax errors and then try again”. What I don’t want is for it to spit out an incomplete AST based on what it thinks the user was trying to say, and continue to the next phase of compilation with no indication that anything went wrong (other than the error messages which went to the console and I can’t see). Yet by default, it does exactly that.

The Definitive ANTLR Reference offers a technique to stop parsing as soon as a syntax error is detected: override the mismatch and recoverFromMismatchedSet methods to throw RecognitionExceptions, and add a @rulecatch action to do the same. This would seem to lose the benefit of recovering from parse errors, but more importantly, it only partially works. If a necessary token is missing (for instance, if a binary operator only has an expression on one side of it), it throws an exception just as expected, but if an extraneous token is added, ANTLR inserts the token that it thinks belongs there and continues on its merry way, producing an AST with no indication of a syntax error except a console message. (To make matters worse, the token it inserted was EOF, so the rest of the file didn’t even get parsed.)

I’m sure I could fix this by, say, adding something like an isValid field to the parser and overriding methods and adding actions so that, at the end of the parse, it throws an exception if there were any errors. But is there a better way? I can’t imagine that what I’m trying to do is unusual among ANTLR users.

1 Answer

  1. Editorial Team
    Added an answer on June 15, 2026 at 11:11 pm

    “… [O]nce it’s done parsing I want to get an exception or some kind of indication that the input wasn’t syntactically valid; that way I can stop the compilation…”

    You can call getNumberOfSyntaxErrors on both the lexer and the parser after parsing to determine if there was an error that was covertly accommodated by ANTLR. This doesn’t tell you what those errors were, obviously, but I think these methods address the “once it’s done parsing … stop the compilation” part of your question.
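
As a rough sketch of that post-parse check: `getNumberOfSyntaxErrors()` is the real method on ANTLR 3 lexers and parsers, but the `ErrorCounting` interface and `SyntaxCheck` helper below are hypothetical stand-ins, used only so the example compiles without the ANTLR jar or a generated grammar.

```java
/** Hypothetical stand-in for anything exposing ANTLR 3's getNumberOfSyntaxErrors(). */
interface ErrorCounting {
    int getNumberOfSyntaxErrors();
}

/** Hypothetical helper: stop the compilation if either stage recovered from errors. */
final class SyntaxCheck {
    private SyntaxCheck() {}

    static void requireCleanParse(ErrorCounting lexer, ErrorCounting parser) {
        // Both counts matter: the lexer can recover from bad characters on its own.
        int total = lexer.getNumberOfSyntaxErrors() + parser.getNumberOfSyntaxErrors();
        if (total > 0) {
            throw new IllegalStateException(
                    total + " syntax error(s); fix them and try again");
        }
    }
}
```

With real generated classes you would call `getNumberOfSyntaxErrors()` directly on the lexer and parser objects once the start rule has returned, and skip the later compilation phases if the sum is non-zero.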

    “The Definitive ANTLR Reference offers a technique to stop parsing as soon as a syntax error is detected: override the mismatch and recoverFromMismatchedSet methods to throw RecognitionExceptions, and add a @rulecatch action to do the same.”

    I don’t think you mentioned which version of ANTLR you’re using, but the documentation in the ANTLR v3.4 code for the method recoverFromMismatchedSet says it’s “not currently used” and an Eclipse “global usage” scan found no callers. Neither here nor there to your main problem, but I wanted to mention it for the record. It may be the correct method to override for your version.

    “If a necessary token is missing …, [the overridden code] throws an exception just as expected, but if an extraneous token is added, ANTLR inserts the token that it thinks belongs there and continues on its merry way…”

    Method recoverFromMismatchedToken tests for a recoverable missing or extraneous token by delegating to methods mismatchIsMissingToken and mismatchIsUnwantedToken respectively. If either method determines that an insertion or deletion will solve the problem, recoverFromMismatchedToken makes the corresponding correction. If neither operation solves the mismatched token problem, recoverFromMismatchedToken throws a MismatchedTokenException.

    If a recovery operation takes place, reportError is called, which calls displayRecognitionError with the details.
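
A simplified model of that decision may help. This is not ANTLR's source: `RecoveryModel`, `Outcome`, `MismatchException`, and the plain integer token types are all hypothetical, and the real mismatchIsMissingToken works out its follow set from the parser's state rather than taking it as a parameter. The sketch tries deletion before insertion, which is the order I believe the 3.4 runtime uses.

```java
import java.util.Set;

/** Hypothetical, simplified model of the choice recoverFromMismatchedToken makes. */
final class RecoveryModel {
    enum Outcome { DELETED_UNWANTED_TOKEN, INSERTED_MISSING_TOKEN }

    /** Stand-in for ANTLR's MismatchedTokenException. */
    static final class MismatchException extends Exception {
        MismatchException(String msg) { super(msg); }
    }

    private RecoveryModel() {}

    /**
     * @param expected token type the parser wanted to match
     * @param current  token type actually at the input position (lookahead 1)
     * @param next     token type after it (lookahead 2)
     * @param follow   token types that may legally come after the expected token
     */
    static Outcome recover(int expected, int current, int next, Set<Integer> follow)
            throws MismatchException {
        // Extraneous token: skipping the current token would expose the expected one.
        if (next == expected) {
            return Outcome.DELETED_UNWANTED_TOKEN;
        }
        // Missing token: the current token could legally follow the expected one,
        // so the parser can pretend the expected token was there.
        if (follow.contains(current)) {
            return Outcome.INSERTED_MISSING_TOKEN;
        }
        // Neither single-token fix works, so the mismatch is not recoverable here.
        throw new MismatchException("expected " + expected + " but found " + current);
    }
}
```

In the first two outcomes the parse carries on and the only trace is the reported error, which is why the questioner saw a finished AST alongside a console message.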

    This applies to ANTLR v3.4 and possibly earlier versions.

    This gives you at least two options:

    • Override recoverFromMismatchedToken and handle errors at a fine-grained level. From here you can delegate the call to the super implementation, roll your own recovery code, or bail out with an exception. Whatever the case, your code will be called and thus will be aware that a mismatch error occurred, recoverable or otherwise. This option is probably equivalent to overriding recoverFromMismatchedSet.

    • Override displayRecognitionError and handle the errors at a coarse-grained level. Method reportError does some state juggling, so I wouldn’t recommend overriding it unless the overriding implementation calls the super-implementation. Method displayRecognitionError appears to be one of the last calls in the recovered-token call chain, so it would be a reasonable place to determine whether or not to continue. I would prefer it had a name that indicated that it was a reasonable place for that, but oh well. Here is an answer that demonstrates this option.
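
The two options above can be sketched roughly as follows. `StubRecognizer` is a hypothetical stand-in that only mimics the shape of ANTLR 3's error hooks so the example compiles without the ANTLR jar; in the real runtime, displayRecognitionError takes the token names and a RecognitionException rather than a string, and recoverFromMismatchedToken takes the input stream, the expected token type, and a follow set. In a real grammar the overrides would go in the parser's @members section.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in that mimics the shape of ANTLR 3's recognizer error hooks. */
class StubRecognizer {
    private int syntaxErrors = 0;

    /** Mirrors a successful single-token recovery, which ends by reporting the error. */
    public void recoverFromMismatchedToken(String details) {
        reportError(details);
    }

    /** Mirrors reportError: does the bookkeeping, then hands off for display. */
    public void reportError(String details) {
        syntaxErrors++;
        displayRecognitionError(details);
    }

    /** Mirrors displayRecognitionError: by default the message goes to the console. */
    public void displayRecognitionError(String details) {
        System.err.println(details);
    }

    public int getNumberOfSyntaxErrors() {
        return syntaxErrors;
    }
}

/** Fine-grained option: refuse to recover and stop at the first mismatch. */
class BailingRecognizer extends StubRecognizer {
    @Override
    public void recoverFromMismatchedToken(String details) {
        throw new IllegalStateException("syntax error: " + details);
    }
}

/** Coarse-grained option: keep the default recovery, but capture every message. */
class CollectingRecognizer extends StubRecognizer {
    private final List<String> errors = new ArrayList<>();

    @Override
    public void displayRecognitionError(String details) {
        errors.add(details); // record the message instead of printing it
    }

    public List<String> getErrors() {
        return errors;
    }
}
```

The collecting variant leaves reportError untouched, so the error count and recovery still behave as before; the caller inspects `getErrors()` after the parse and decides whether to continue.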

    I’m partial towards overriding displayRecognitionError because it provides the error message text easily enough and because I know it will be called only after a token recovery operation and the required state juggling, so my parser has no need to work out how to recover for itself. This, coupled with getNumberOfSyntaxErrors, appears to give you the options that you’re looking for, assuming that you’re working with a relevant version of ANTLR and that I fully understood your problem.

© 2021 The Archive Base. All Rights Reserved