The Archive Base

Editorial Team
Asked: May 27, 2026 (2026-05-27T07:58:17+00:00)


Another simple question: is there any way to tell flex to prefer a rule that matches a short string over a rule that matches a longer one? I can’t find any good documentation about that.

Here is why I need that: I parse a file written in a pseudo-language that contains some keywords corresponding to control instructions. I’d like them to have absolute priority so that they’re not parsed as parts of an expression. I need this priority mechanism because I don’t need to write a full grammar for my project (that would be overkill, since I only perform structural analysis on the parsed program and don’t need the details), so I can’t rely on fine grammar tuning to make sure those blocks won’t be parsed into an expression.

Any help will be appreciated.

Here is an example of a file parsed :

If a > 0 Then read(b); Endif
c := "If I were...";
While d > 5 Do d := d + 1 Endwhile

I just want to collect information on the Ifs, Thens, Endifs, etc. The rest doesn’t matter to me. That’s why I’d like the rules for If, Then, etc. to be prioritized without having to write a grammar.

1 Answer

    Editorial Team added an answer on May 27, 2026 at 7:58 am (2026-05-27T07:58:17+00:00)

    From the Dragon Book 2nd edition, Section 3.5.3 “Conflict Resolution in Lex”:

    We have alluded to the two rules that Lex uses to decide on the proper lexeme
    to select, when several prefixes of the input match one or more patterns:
        1. Always prefer a longer prefix to a shorter prefix.
        2. If the longest possible prefix matches two or more patterns, prefer the
           pattern listed first in the Lex program.
    

    The rules above also apply to Flex. Here is what the Flex manual says (Chapter 7, “How the Input Is Matched”):

    When the generated scanner is run, it analyzes its input looking for strings 
    which match any of its patterns. If it finds more than one match, it takes the 
    one matching the most text (for trailing context rules, this includes the length 
    of the trailing part, even though it will then be returned to the input). If it 
    finds two or more matches of the same length, the rule listed first in the flex 
    input file is chosen.
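
The two rules quoted above can be sketched in plain C++. This is only a hand-written illustration of the selection logic, not the code flex generates; `Rule`, `matchIf`, `matchIdentifier`, and `pickRule` are hypothetical names introduced for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// A rule pairs a token name with a matcher that returns how many characters
// it can consume starting at `pos` (0 means "no match").
struct Rule {
    std::string name;
    std::size_t (*match)(const std::string& s, std::size_t pos);
};

// Matches the literal keyword "If".
std::size_t matchIf(const std::string& s, std::size_t pos) {
    return (pos <= s.size() && s.compare(pos, 2, "If") == 0) ? 2 : 0;
}

// Matches [a-zA-Z_][a-zA-Z0-9_]*, the identifier pattern used in the answer.
std::size_t matchIdentifier(const std::string& s, std::size_t pos) {
    std::size_t i = pos;
    if (i >= s.size()) return 0;
    unsigned char first = static_cast<unsigned char>(s[i]);
    if (!(std::isalpha(first) || first == '_')) return 0;
    ++i;
    while (i < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[i])) || s[i] == '_')) {
        ++i;
    }
    return i - pos;
}

// Returns the name of the winning rule at `pos`, or "NONE" if nothing matches.
// Rule 1: the longest match wins.
// Rule 2: on a tie, the rule listed first wins (strict '>' keeps the earlier one).
std::string pickRule(const std::vector<Rule>& rules,
                     const std::string& s, std::size_t pos) {
    std::string best = "NONE";
    std::size_t bestLen = 0;
    for (const Rule& r : rules) {
        std::size_t len = r.match(s, pos);
        if (len > bestLen) {
            bestLen = len;
            best = r.name;
        }
    }
    return best;
}
```

With the rules listed as `{{"IF", matchIf}, {"IDENTIFIER", matchIdentifier}}`, the input `If a > 0` selects `IF` (both rules match two characters, so the first one wins), while `If_this_is_an_identifier` selects `IDENTIFIER` (the longer match wins). Reversing the rule order makes `If a > 0` select `IDENTIFIER`, which is why the keyword rules belong at the top.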
    

    If I understood correctly, your lexer treats keywords such as Endif as identifiers, so they are later considered part of an expression. If that is your problem, simply put the keyword rules at the top of your specification, as in the following (assume each uppercase word is a predefined enum value corresponding to a token):

    "If"                      { return IF;         }
    "Then"                    { return THEN;       }
    "Endif"                   { return ENDIF;      }
    "While"                   { return WHILE;      }
    "Do"                      { return DO;         }
    "Endwhile"                { return ENDWHILE;   }
    \"(\\.|[^\\"])*\"         { return STRING;     }
    [a-zA-Z_][a-zA-Z0-9_]*    { return IDENTIFIER; }
    

    Then a keyword will always be matched in preference to the identifier rule when both match the same text, due to rule 2.

    EDIT:

    Thank you for your comment, kol. I forgot to add the rule for strings. But I don’t think my solution is wrong. For example, given an identifier called If_this_is_an_identifier, rule 1 applies, so the identifier rule takes effect (since it matches the longest string). I wrote a simple test case and saw no problem with my solution. Here is my lex.l file:

    %{
      #include <cstdio>    // fopen, stdin
      #include <iostream>
      using namespace std;
    %}
    
    ID       [a-zA-Z_][a-zA-Z0-9_]*
    
    %option noyywrap
    %%
    
    "If"                      { cout << "IF: " << yytext << endl;         }
    "Then"                    { cout << "THEN: " << yytext << endl;       }
    "Endif"                   { cout << "ENDIF: " << yytext << endl;      }
    "While"                   { cout << "WHILE: " << yytext << endl;      }
    "Do"                      { cout << "DO: " << yytext << endl;         }
    "Endwhile"                { cout << "ENDWHILE: " << yytext << endl;   }
    \"(\\.|[^\\"])*\"         { cout << "STRING: " << yytext << endl;     }
    {ID}                      { cout << "IDENTIFIER: " << yytext << endl; }
    .                         { cout << "Ignore token: " << yytext << endl; }
    
    %%
    
    int main(int argc, char* argv[]) {
      ++argv, --argc;  /* skip over program name */
      if ( argc > 0 )
        yyin = fopen( argv[0], "r" );  /* read from the file named on the command line */
      else
        yyin = stdin;
    
      yylex();
      return 0;
    }
    

    I tested my solution with the following test case:

    If If_this_is_an_identifier > 0 Then read(b); Endif
        c := "If I were...";
    While While_this_is_also_an_identifier > 5 Do d := d + 1 Endwhile
    

    and it gives me the following output (output not relevant to the problem you mentioned is omitted):

    IF: If
    IDENTIFIER: If_this_is_an_identifier
    ......
    STRING: "If I were..."
    ......
    WHILE: While
    IDENTIFIER: While_this_is_also_an_identifier
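
For the stated goal of only collecting the control keywords, the same behavior can be approximated without flex. The sketch below is a hand-rolled stand-in, not the generated scanner: `collectKeywords` is a hypothetical helper, the keyword spellings are taken from the question's sample, and an unterminated string is simply skipped to the end of the input, which is a simplification.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns the control keywords found in `src`, in order of appearance.
// String literals are skipped, and a keyword only counts when it is a whole
// word (the longest identifier-like run), mirroring the longest-match rule.
std::vector<std::string> collectKeywords(const std::string& src) {
    static const std::set<std::string> keywords = {
        "If", "Then", "Endif", "While", "Do", "Endwhile"};

    std::vector<std::string> found;
    const std::size_t n = src.size();
    std::size_t i = 0;

    while (i < n) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (c == '"') {
            // Skip a string literal, honoring backslash escapes.
            ++i;
            while (i < n && src[i] != '"') {
                i += (src[i] == '\\' && i + 1 < n) ? 2 : 1;
            }
            if (i < n) ++i;  // consume the closing quote
        } else if (std::isalpha(c) || c == '_') {
            // Read the longest run of [a-zA-Z0-9_] and test it as a whole.
            std::size_t start = i;
            while (i < n &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
            std::string word = src.substr(start, i - start);
            if (keywords.count(word) != 0) found.push_back(word);
        } else {
            ++i;  // anything else is ignored
        }
    }
    return found;
}
```

Run on the three-line sample from the question, this yields If, Then, Endif, While, Do, Endwhile; the `If` inside `"If I were..."` and the prefix of `If_this_is_an_identifier` are not reported.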
    

    The lex.l program is adapted from an example in the flex manual, which uses the same method to distinguish keywords from identifiers.

    Also have a look at the ANSI C grammar, Lex specification.

    I also used this approach in a personal project, and so far I have not found any problems.
