The Archive Base

Editorial Team
Asked: May 20, 2026 (2026-05-20T03:15:47+00:00)

I am trying to create a grammar which accepts any character or number or just about anything, provided its length is equal to 1.

Is there a function to check the length?

EDIT

Let me make my question clearer with an example.
I wrote the following grammar:

grammar first;

tokens {
    SET =   'set';
    VAL =   'val';
    UND =   'und';
    CON =   'con';
    ON  =   'on';
    OFF =   'off';
}

@parser::members {
  private boolean inbounds(Token t, int min, int max) {
    int n = Integer.parseInt(t.getText());
    return n >= min && n <= max;
  }
}

parse   :   SET expr;

expr    :   VAL('u'('e')?)? String |
        UND('e'('r'('l'('i'('n'('e')?)?)?)?)?)? (ON | OFF) |
        CON('n'('e'('c'('t')?)?)?)? oneChar
    ;

CHAR    :   'a'..'z';

DIGIT   :   '0'..'9';

String  :   (CHAR | DIGIT)+;

dot :   .;

oneChar :   dot { $dot.text.length() == 1;} ;

Space  : (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};

I want my grammar to do the following things:

  1. Accept commands like: ‘set value abc’, ‘set underli on’, ‘set conn #’. The grammar should be intelligent enough to accept incomplete words such as ‘underl’ instead of ‘underline’.
  2. The third syntax, ‘set connect oneChar’, should accept any character, but just one character. It can be a digit, a letter or any special character. I am getting a compiler error in the generated parser file because of this.
  3. The first syntax, ‘set value’, should accept all possible strings, even ‘on’ and ‘off’. But when I give something like ‘set value offer’, the grammar fails. I think this is happening because I already have a token ‘OFF’.

None of the three requirements listed above works in my grammar, and I don’t know why.

1 Answer

  1. Editorial Team
    Added an answer on May 20, 2026 at 3:15 am

    There are some mistakes and/or bad practices in your grammar:


    #1

    The following is not a validating predicate:

    {$dot.text.length() == 1;}
    

    A proper validating predicate in ANTLR has a question mark at the end, and the inner code has no semicolon at the end. So it should be:

    {$dot.text.length() == 1}?
    

    instead.
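To make the idea concrete, here is a minimal plain-Java sketch of what a validating predicate amounts to at parse time. It does not use ANTLR or show ANTLR's generated code; the class name `PredicateSketch`, the method `oneChar` and the choice of `IllegalStateException` are assumptions made for illustration only. The point is that the braces hold a boolean expression that is checked after the match, and the rule fails when it is false.

```java
// Hypothetical stand-in for a validating predicate; not ANTLR's generated code.
class PredicateSketch {

    // Mirrors the boolean check in {$dot.text.length() == 1}?
    // Returns the text unchanged when the predicate holds, fails otherwise.
    static String oneChar(String tokenText) {
        if (!(tokenText.length() == 1)) {
            throw new IllegalStateException(
                "rule oneChar failed predicate: length == 1, got \"" + tokenText + "\"");
        }
        return tokenText;
    }

    public static void main(String[] args) {
        System.out.println(oneChar("x"));   // predicate holds
        try {
            oneChar("xy");                  // predicate fails
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that the condition is an expression, not a statement, which is why the trailing semicolon has to go.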


    #2

    You should not be handling these alternative commands:

    expr
      :  VAL('u'('e')?)? String 
      |  UND('e'('r'('l'('i'('n'('e')?)?)?)?)?)? (ON | OFF) 
      |  CON('n'('e'('c'('t')?)?)?)? oneChar
      ;
    

    in a parser rule. You should let the lexer handle this instead. Something like this will do it:

    expr
      :  VAL String
      |  UND (ON | OFF)
      |  CON oneChar
      ;
    
    // ...
    
    VAL : 'val' ('u' ('e')?)?;
    UND : 'und' ( 'e' ( 'r' ( 'l' ( 'i' ( 'n' ( 'e' )?)?)?)?)?)?;
    CON : 'con' ( 'n' ( 'e' ( 'c' ( 't' )?)?)?)?;
    

    (also see #5!)
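The nested optional groups in these lexer rules accept every prefix of the keyword that is at least as long as the mandatory part. As a rough cross-check outside ANTLR, the same set of words can be described in plain Java; the class `AbbrevSketch` and the method `isAbbreviation` are hypothetical names used only for this sketch.

```java
// Plain-Java description of what 'und' ('e' ('r' ( ... )?)?)? accepts.
// Hypothetical helper, not part of ANTLR or of the grammar itself.
class AbbrevSketch {

    // True when word is a prefix of keyword and has at least minLen characters.
    static boolean isAbbreviation(String word, String keyword, int minLen) {
        return word.length() >= minLen && keyword.startsWith(word);
    }

    public static void main(String[] args) {
        String[] samples = {"und", "underl", "underline", "un", "undo", "underlines"};
        for (String s : samples) {
            System.out.println(s + " -> " + isAbbreviation(s, "underline", 3));
        }
    }
}
```

Here "und", "underl" and "underline" are accepted, while "un" (too short), "undo" (not a prefix) and "underlines" (too long) are rejected.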


    #3

    Your lexer rules:

    CHAR    :   'a'..'z';
    DIGIT   :   '0'..'9';  
    String  :   (CHAR | DIGIT)+;
    

    are making things complicated for you. The lexer can produce three different kinds of tokens because of this: CHAR, DIGIT or String. Ideally, you should only create String tokens since a String can already be a single CHAR or DIGIT. You can do that by adding the fragment keyword before these rules:

    fragment CHAR  : 'a'..'z' | 'A'..'Z';
    fragment DIGIT : '0'..'9';
    String : (CHAR | DIGIT)+;
    

    There will now be no CHAR and DIGIT tokens in your token stream, only String tokens. In short: fragment rules are only used inside lexer rules, by other lexer rules. They will never be tokens of their own (and can therefore never appear in any parser rule!).
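A loose analogy in plain Java, using `java.util.regex` rather than ANTLR: the two character classes below play the role of fragments. They are only building blocks for the one pattern that produces tokens. The class name `FragmentSketch` and the `String(...)` label format are assumptions for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Regex analogy for fragment rules; not how ANTLR implements them.
class FragmentSketch {

    // "Fragments": reusable pieces that never become tokens themselves.
    static final String CHAR = "[a-zA-Z]";
    static final String DIGIT = "[0-9]";

    // The only token-producing rule: String : (CHAR | DIGIT)+ ;
    static final Pattern STRING = Pattern.compile("(?:" + CHAR + "|" + DIGIT + ")+");

    // Collects every String token; find() skips characters the pattern cannot match.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = STRING.matcher(input);
        while (m.find()) {
            tokens.add("String(" + m.group() + ")");
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a 7 abc42"));
    }
}
```

Even the single letter and the single digit come out as String tokens, which is the behaviour the fragment keyword gives you.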


    #4

    The rule:

    dot :   .;
    

    does not do what you think it does. It matches “any token”, not “any character”. Inside a lexer rule, the . matches any character but in parser rules, it matches any token. Realize that parser rules can only make use of the tokens created by the lexer.

    The input source is first tokenized based on the lexer rules. After that has been done, the parser (through its parser rules) can then operate on these tokens (not characters!). Make sure you understand this! (If not, ask for clarification or grab a book about ANTLR.)

    – an example –

    Take the following grammar:

    p : . ;
    A : 'a' | 'A';
    B : 'b' | 'B';
    

    The parser rule p will now match any token that the lexer produces, which is only an A or a B token. So p can only match one of the characters 'a', 'A', 'b' or 'B', nothing else.

    And in the following grammar:

    prs : . ;
    FOO : 'a';
    BAR : . ;
    

    the lexer rule BAR matches any single character in the range \u0000 .. \uFFFF, but it can never match the character 'a' since the lexer rule FOO is defined before the BAR rule and captures this 'a' already. And the parser rule prs again matches any token, which is either FOO or BAR.
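The two phases can be mimicked in a few lines of plain Java. This is only a sketch under a strong simplifying assumption: the "lexer" here just splits on whitespace, which is not what an ANTLR lexer does. The names `TokenWildcardSketch`, `lex` and `matchAnyToken` are hypothetical. It shows why a parser-level wildcard cannot limit the input to one character: it consumes one token, however long that token is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified two-phase model: tokenize first, then let the "parser" see tokens only.
class TokenWildcardSketch {

    // Phase 1 (assumed, simplified lexer): every whitespace-separated chunk is one token.
    static List<String> lex(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Phase 2: the parser wildcard '.' takes exactly one token, whatever its length.
    static String matchAnyToken(List<String> tokens, int position) {
        return tokens.get(position);
    }

    public static void main(String[] args) {
        List<String> tokens = lex("conn xy");
        String matched = matchAnyToken(tokens, 1);
        System.out.println("wildcard matched \"" + matched + "\" of length " + matched.length());
    }
}
```

The wildcard happily matches the two-character token, so any length restriction has to be checked separately.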


    #5

    Putting single characters like 'u' inside your parser rules will cause the lexer to tokenize a u as a separate token: you don’t want that. Also, by putting them in parser rules, it is unclear which token has precedence over other tokens. You should keep all such literals outside your parser rules and make them explicit lexer rules instead. Only use lexer rules in your parser rules.

    So, don’t do:

    pRule  : 'u' ':' String ;
    String : ...
    

    but do:

    pRule  : U ':' String ;
    U      : 'u';
    String : ...
    

    You could make ':' a lexer rule, but that is of less importance. The 'u' however can also be a String so it must appear as a lexer rule before the String rule.
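Why the order matters can be pictured with a deliberately simplified model in plain Java: when two rules match the same text, the one listed first wins. A real ANTLR lexer does not decide by scanning a list like this, so treat it purely as an illustration of precedence. The class `RuleOrderSketch`, its methods and the regular expressions standing in for the lexer rules are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified precedence model: the first rule that matches the whole lexeme wins.
class RuleOrderSketch {

    // Rules in declaration order; LinkedHashMap preserves insertion order.
    static LinkedHashMap<String, String> rules() {
        LinkedHashMap<String, String> rules = new LinkedHashMap<>();
        rules.put("U", "u");                  // U      : 'u';
        rules.put("String", "[a-zA-Z0-9]+");  // String : (CHAR | DIGIT)+;
        return rules;
    }

    // Returns the name of the first rule whose pattern matches the entire lexeme.
    static String classify(String lexeme, Map<String, String> rules) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            if (lexeme.matches(rule.getValue())) {
                return rule.getKey();
            }
        }
        return "<no rule>";
    }

    public static void main(String[] args) {
        System.out.println("u  -> " + classify("u", rules()));
        System.out.println("ux -> " + classify("ux", rules()));
    }
}
```

With U declared first, a lone u is classified as U even though String could match it too, while ux can only be a String.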


    Okay, those were the most obvious things that come to mind. Based on them, here’s a proposed grammar:

    grammar first;
    
    parse
      :  (SET expr {System.out.println("expr = " + $expr.text);} )+ EOF
      ;
    
    expr
      :  VAL String    {System.out.print("A :: ");}
      |  UL (ON | OFF) {System.out.print("B :: ");}
      |  CON oneChar   {System.out.print("C :: ");}
      ;
    
    oneChar 
      :  String {$String.text.length() == 1}?
      ;
    
    SET : 'set';
    VAL : 'val' ('u' ('e')?)?;
    UL  : 'und' ( 'e' ( 'r' ( 'l' ( 'i' ( 'n' ( 'e' )?)?)?)?)?)?;
    CON : 'con' ( 'n' ( 'e' ( 'c' ( 't' )?)?)?)?;
    ON  : 'on';
    OFF : 'off';
    
    String : (CHAR | DIGIT)+;
    
    fragment CHAR  : 'a'..'z' | 'A'..'Z';
    fragment DIGIT : '0'..'9';
    
    Space : (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};
    

    that can be tested with the following class:

    import org.antlr.runtime.*;
    
    public class Main {
        public static void main(String[] args) throws Exception {
            String source = 
                    "set value abc  \n" + 
                    "set underli on \n" + 
                    "set conn x     \n" + 
                    "set conn xy      ";
            ANTLRStringStream in = new ANTLRStringStream(source);
            firstLexer lexer = new firstLexer(in);
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            firstParser parser = new firstParser(tokens);
            System.out.println("parsing:\n======\n" + source + "\n======");
            parser.parse();
        }
    }
    

    which, after generating the lexer and parser:

    java -cp antlr-3.2.jar org.antlr.Tool first.g 
    javac -cp antlr-3.2.jar *.java
    java -cp .:antlr-3.2.jar Main
    

    prints the following output:

    parsing:
    ======
    set value abc  
    set underli on 
    set conn x     
    set conn xy      
    ======
    A :: expr = value abc
    B :: expr = underli on
    C :: expr = conn x
    line 0:-1 rule oneChar failed predicate: {$String.text.length() == 1}?
    C :: expr = conn xy
    

    As you can see, the last command, C :: expr = conn xy, produces an error, as expected.
