I am trying to build a Lisp grammar. Easy, right? Apparently not.
I present these inputs and receive errors…
    ( 1 1)
    23 23 23
    ui ui
This is the grammar…
    %%
    sexpr: atom                 {printf("matched sexpr\n");}
        | list
        ;
    list: '(' members ')'       {printf("matched list\n");}
        | '(' ')'               {printf("matched empty list\n");}
        ;
    members: sexpr              {printf("members 1\n");}
        | sexpr members         {printf("members 2\n");}
        ;
    atom: ID                    {printf("ID\n");}
        | NUM                   {printf("NUM\n");}
        | STR                   {printf("STR\n");}
        ;
    %%
As near as I can tell, I need a single non-terminal defined as a program, upon which the whole parse tree can hang. But I tried it and it didn’t seem to work.
edit – this was my ‘top terminal’ approach:
    program: slist;
    slist: slist sexpr
        | sexpr
        ;
But it allows problems such as:
( 1 1
Edit2: The FLEX code is…
    %{
    #include <stdio.h>
    #include <stdlib.h>   /* atoi */
    #include "a.yacc.tab.h"
    int linenumber;
    extern int yylval;
    %}
    %%
    \n                      { linenumber++; }
    [0-9]+                  { yylval = atoi(yytext); return NUM; }
    \"[^\"\n]*\"            { return STR; }
    [a-zA-Z][a-zA-Z0-9]*    { return ID; }
    .
    %%
An example of the over-matching…
    (1 1 1)
    NUM
    matched sexpr
    NUM
    matched sexpr
    NUM
    matched sexpr
    (1 1
    NUM
    matched sexpr
    NUM
    matched sexpr
What’s the error here?
edit: The error was in the lexer.
The error is really in the lexer. Your parentheses match only the catch-all ‘.’ rule at the end of the lexer, which has no action, so they are discarded and never reach the parser as tokens.
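Add rules like

    \(    { return LPAREN; }
    \)    { return RPAREN; }

to the lexer, above the final ‘.’ rule (when two rules match the same length, flex picks the one listed first), and change all occurrences of ‘(’ and ‘)’ in the parser to LPAREN and RPAREN respectively. You also need to declare LPAREN and RPAREN (with %token) where you define your token list.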
Note: I’m not sure about the syntax; the backslashes could be wrong.
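To see why dropping the parentheses in the lexer produces exactly the over-matching shown in the question, here is a rough sketch in plain C. It is a hand-written tokenizer and recursive-descent parser, not the flex/yacc code itself. All names here (`accepts`, `emit_parens`, `T_LPAREN` and so on) are made up for illustration, and atoms are simplified to any run of letters and digits. With `emit_parens` set to 0 the tokenizer throws parentheses away like the ‘.’ rule does; with 1 it returns them as tokens.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds; LPAREN/RPAREN mirror the names suggested above. */
enum token { T_EOF, T_LPAREN, T_RPAREN, T_ATOM };

struct lexer {
    const char *p;     /* current position in the input */
    int emit_parens;   /* 0 = discard parens (like the '.' rule), 1 = return them */
};

static enum token next_token(struct lexer *lx)
{
    for (;;) {
        unsigned char c = (unsigned char)*lx->p;
        if (c == '\0')
            return T_EOF;
        if (isalnum(c)) {
            /* Simplification: any run of letters/digits is one atom. */
            while (isalnum((unsigned char)*lx->p))
                lx->p++;
            return T_ATOM;
        }
        lx->p++;  /* consume one non-alphanumeric character */
        if (lx->emit_parens && c == '(')
            return T_LPAREN;
        if (lx->emit_parens && c == ')')
            return T_RPAREN;
        /* Anything else (including parens when emit_parens is 0) is dropped. */
    }
}

struct parser {
    struct lexer lx;
    enum token look;   /* one token of lookahead */
};

static void advance(struct parser *ps)
{
    ps->look = next_token(&ps->lx);
}

/* sexpr: ATOM | LPAREN sexpr* RPAREN ; returns 1 on success, 0 on error */
static int parse_sexpr(struct parser *ps)
{
    if (ps->look == T_ATOM) {
        advance(ps);
        return 1;
    }
    if (ps->look != T_LPAREN)
        return 0;
    advance(ps);
    while (ps->look == T_ATOM || ps->look == T_LPAREN)
        if (!parse_sexpr(ps))
            return 0;
    if (ps->look != T_RPAREN)
        return 0;      /* unbalanced: missing ')' */
    advance(ps);
    return 1;
}

/* program: one or more sexprs; returns 1 if the whole input parses */
int accepts(const char *src, int emit_parens)
{
    struct parser ps;
    assert(src != NULL);
    ps.lx.p = src;
    ps.lx.emit_parens = emit_parens;
    advance(&ps);
    if (!parse_sexpr(&ps))
        return 0;
    while (ps.look != T_EOF)
        if (!parse_sexpr(&ps))
            return 0;
    return 1;
}
```

With this sketch, `accepts("(1 1", 0)` returns 1, because the parser only ever sees two atoms, while `accepts("(1 1", 1)` returns 0, because the missing ‘)’ is now visible to the parser. The grammar was never the problem; the parser just was not being told about the parentheses.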