I have written a complex grammar. The grammar can be seen below:
grammar i;
options {
output=AST;
}
@header {
package com.data;
}
operatorLogic : 'AND' | 'OR';
value : STRING;
query : (select)*;
select : 'SELECT'^ functions 'FROM table' filters?';';
operator : '=' | '!=' | '<' | '>' | '<=' | '>=';
filters : 'WHERE'^ conditions;
conditions : (members (operatorLogic members)*);
members : STRING operator value;
functions : '*';
STRING : ('a'..'z'|'A'..'Z')+;
WS : (' '|'\t'|'\f'|'\n'|'\r')+ {skip();}; // handle white space between keywords
The parser outputs an AST. The above is only a small sample; I am developing a much larger grammar and need advice on how to approach it.
For example, the grammar above accepts the following input:
SELECT * FROM table;
SELECT * FROM table WHERE name = i AND name = j;
The queries could get more complex. I have wired the AST output into my Java code and can get the tree back. I wanted to separate the grammar from the logic so that each stays cohesive, so an AST seemed the best approach.
The user will enter a query as a String, and my code needs to handle it in the best way possible. As you can see, the functions rule currently matches only *, which means select all. In the future this could expand to include other things.
How can my code handle this? What’s the best approach?
I could do something like this:
String input = "SELECT * FROM table;";
if(input.startsWith("SELECT")) {
select();
}
As you can see, this approach gets complicated quickly: I need to handle * as well as the optional filters, and the operatorLogic rule (AND and OR) also needs to be handled.
What is the best way? I have looked online, but couldn’t find any example on how to handle this.
Are you able to give any examples?
EDIT:
String input = "SELECT * FROM table;";
// Test the more specific prefix first; otherwise the "SELECT" branch always wins.
if(input.startsWith("SELECT *")) {
findAll();
}
else if(input.startsWith("SELECT")) {
select();
}
The easiest way to handle multiple starting rules (“SELECT …”, “UPDATE…”, etc) is to let the ANTLR grammar do the work for you at a single, top-level starting rule. You pretty much have that already, so it’s just a matter of updating what you have.
Currently your grammar is limited to one command type of input (“SELECT…”) because that’s all you’ve defined:
query : (select)*;
If query is your starting rule, then accepting additional top-level input is a matter of defining query to accept more than select, for example (select | update)* together with a new update rule that matches “UPDATE;”.
Now the query rule can handle input such as “SELECT * FROM table;”, “UPDATE;”, or “SELECT * FROM table; UPDATE;”. When a new top-level rule is added, just update query to test for that new rule. This way your Java code doesn’t need to test the input; it just calls the query rule and lets the parser handle the rest.
If you only want one type of command to be processed from the input, define query without the repetition, for example as (select | update).
The rule query still handles “SELECT * FROM table;” and “UPDATE;”, but not a mix of commands, like “SELECT * FROM table; UPDATE;”.
Once you get your query_return AST tree from calling query, you now have something meaningful that your Java code can process, instead of a string. That tree represents all the input that the parser processed. You can walk through the children of the tree with getChildCount() and getChild(...).
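As a rough sketch of that walk, the code below loops over the top-level children and then recurses through the whole tree. To keep it runnable without the ANTLR jar, it uses a small hypothetical Node class that only mirrors the getText(), getChildCount() and getChild(i) accessors of ANTLR 3's Tree interface. The tree is built by hand and is only assumed to resemble what the parser would return for "SELECT * FROM table WHERE name = i;". With the real runtime you would start from the object returned by getTree() on the query_return instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AstWalkSketch {
    /** Hypothetical stand-in for ANTLR's Tree; only the three accessors used below. */
    static class Node {
        private final String text;
        private final List<Node> children;

        Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }

        String getText() { return text; }
        int getChildCount() { return children.size(); }
        Node getChild(int i) { return children.get(i); }
    }

    /** Collects the text of the root's direct children only. */
    static List<String> topLevel(Node root) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < root.getChildCount(); i++) {
            out.add(root.getChild(i).getText());
        }
        return out;
    }

    /** Pre-order walk of the whole tree; each entry is indented by its depth. */
    static void walk(Node node, int depth, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        out.add(line.append(node.getText()).toString());
        for (int i = 0; i < node.getChildCount(); i++) {
            walk(node.getChild(i), depth + 1, out);
        }
    }

    /** Hand-built tree, assumed to resemble the AST for: SELECT * FROM table WHERE name = i; */
    static Node sampleTree() {
        return new Node("SELECT",
                new Node("*"),
                new Node("FROM table"),
                new Node("WHERE", new Node("name"), new Node("="), new Node("i")),
                new Node(";"));
    }

    public static void main(String[] args) {
        Node tree = sampleTree();
        System.out.println(topLevel(tree));   // direct children of SELECT
        List<String> all = new ArrayList<>();
        walk(tree, 0, all);                   // every node, indented by depth
        for (String line : all) {
            System.out.println(line);
        }
    }
}
```

The recursive version is the one to reach for once WHERE clauses appear, because their conditions sit one level below the SELECT node.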
Walking through the entire AST tree is a matter of recursively calling getChild(...) on all parent nodes; a single loop over the root's children looks at the top-level children only.
Handling alternatives to * is no different than any other alternatives you’ve defined: just define the alternatives in the rule you want to expand. If you want functions to accept more than *, define functions to accept more than *. 😉 For example, functions could be defined as '*' | STRING.
Now the parser can accept “SELECT * FROM table;” and “SELECT foobar FROM table;”.
Remember that your Java code has no reason to examine the input string. Whenever you’re tempted to do that, look for a way to make your grammar do the examining instead. Your Java code will then look at the AST tree output for whatever it wants.
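To make that last point concrete, here is a minimal sketch of Java code that decides what to do from the tree alone, never from the input string. Everything in it is an assumption for illustration: the Node class is a hypothetical stand-in for ANTLR's Tree (so the sketch runs without the ANTLR jar), the trees are built by hand, and the returned labels simply name handlers such as the findAll() mentioned in the question. With the real runtime, comparing token types is generally more robust than comparing node text.

```java
import java.util.Arrays;
import java.util.List;

class AstDispatchSketch {
    /** Hypothetical stand-in for ANTLR's Tree; only the three accessors used below. */
    static class Node {
        private final String text;
        private final List<Node> children;

        Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }

        String getText() { return text; }
        int getChildCount() { return children.size(); }
        Node getChild(int i) { return children.get(i); }
    }

    /** Returns true if any direct child of the node has the given text. */
    static boolean hasChild(Node node, String text) {
        for (int i = 0; i < node.getChildCount(); i++) {
            if (text.equals(node.getChild(i).getText())) {
                return true;
            }
        }
        return false;
    }

    /** Picks a (hypothetical) handler by inspecting the command's root node and children. */
    static String dispatch(Node command) {
        String kind = command.getText();
        if ("SELECT".equals(kind)) {
            // Assumes the first child is whatever the functions rule matched.
            String target = command.getChild(0).getText();
            String handler = "*".equals(target) ? "findAll" : "findColumn(" + target + ")";
            return hasChild(command, "WHERE") ? handler + " with filters" : handler;
        }
        if ("UPDATE".equals(kind)) {
            return "update";
        }
        throw new IllegalArgumentException("Unsupported command: " + kind);
    }

    public static void main(String[] args) {
        Node selectAll = new Node("SELECT", new Node("*"), new Node("FROM table"), new Node(";"));
        Node selectColumn = new Node("SELECT", new Node("foobar"), new Node("FROM table"),
                new Node("WHERE", new Node("name"), new Node("="), new Node("i")), new Node(";"));
        System.out.println(dispatch(selectAll));
        System.out.println(dispatch(selectColumn));
        System.out.println(dispatch(new Node("UPDATE", new Node(";"))));
    }
}
```

Adding a new command then means adding a grammar rule and one more branch in dispatch, with no string matching on the raw query.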