I am trying to write an Xtext grammar for configuration files (the kind with the .ini extension).
For instance, I’d like to successfully parse:
[Section1]
a = Easy123
b = This *is* valid too
[Section_2]
c = Voilà # inline comments are ignored
My problem is matching the property value (what’s on the right of the ‘=’).
My current grammar works only if the property value matches the ID terminal (e.g. a = Easy123).
PropertyFile hidden(SL_COMMENT, WS):
    sections+=Section*;
Section:
    '[' name=ID ']'
    (NEWLINE properties+=Property)+
    NEWLINE+;
Property:
    name=ID (':' | '=') value=ID ';'?;
terminal WS:
    (' ' | '\t')+;
terminal NEWLINE:
    // New line on DOS or Unix
    '\r'? '\n';
terminal ID:
    ('A'..'Z' | 'a'..'z') ('A'..'Z' | 'a'..'z' | '_' | '-' | '0'..'9')*;
terminal SL_COMMENT:
    // Single line comment
    '#' !('\n' | '\r')*;
I don’t know how to generalize the grammar to match any text (e.g. c = Voilà).
I certainly need to introduce a new terminal:
Property:
    name=ID (':' | '=') value=TEXT ';'?;
The question is: how should I define this TEXT terminal?
I have tried:
- terminal TEXT: ANY_OTHER+;
  This raises a warning: The following token definitions can never be matched because prior tokens match the same input: RULE_INT,RULE_STRING,RULE_ML_COMMENT,RULE_ANY_OTHER
  (I think it doesn’t matter.)
  Parsing fails with:
  Required loop (…)+ did not match anything at input ‘à’
- terminal TEXT: !('\r'|'\n'|'#')+;
  This raises a warning: The following token definitions can never be matched because prior tokens match the same input: RULE_INT
  (I think it doesn’t matter.)
  Parsing fails with:
  Missing EOF at [Section1]
- terminal TEXT: ('!'|'$'..'~');
  (which covers most characters, except # and ")
  No warning during the generation of the lexer/parser.
  However, parsing fails with:
  Mismatch input ‘Easy123’ expecting RULE_TEXT
  Extraneous input ‘This’ expecting RULE_TEXT
  Required loop (…)+ did not match anything at ‘is’
Thanks for your help (and I hope this grammar can be useful for others too)
This grammar does the trick:
The key is not to try to cover the complete semantics in the grammar alone, but to take other services into account, too. The terminal rule PROPERTY_VALUE consumes the complete value, including the leading assignment and the optional trailing semicolon.
Now just register a value converter service for that language and take care of the insignificant parts of the input there:
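The clean-up step described above can be sketched in plain Java. This is not the original converter from this answer, only a minimal illustration under stated assumptions: the raw token text starts with ':' or '=' and may end with ';', as described for PROPERTY_VALUE. The class and method names (PropertyValueNormalizer, toValue, toText) are hypothetical. In an actual Xtext project this logic would presumably sit inside a value converter registered for the PROPERTY_VALUE rule rather than in a standalone class.

```java
// Hypothetical helper, not part of Xtext: it only shows the string handling
// that a value converter for PROPERTY_VALUE would need to perform.
final class PropertyValueNormalizer {

    private PropertyValueNormalizer() {
    }

    /**
     * Turns raw token text such as "= Easy123;" into the bare value "Easy123".
     * Assumes the token begins with ':' or '=' (the leading assignment).
     */
    static String toValue(String raw) {
        String text = raw.trim();
        if (text.isEmpty() || (text.charAt(0) != '=' && text.charAt(0) != ':')) {
            throw new IllegalArgumentException(
                    "Expected ':' or '=' at the start of: " + raw);
        }
        // Drop the assignment character and the whitespace after it.
        text = text.substring(1).trim();
        // Drop the optional trailing semicolon.
        if (text.endsWith(";")) {
            text = text.substring(0, text.length() - 1).trim();
        }
        return text;
    }

    /** Inverse direction: renders a bare value back into token text. */
    static String toText(String value) {
        return "= " + value;
    }
}
```

For example, PropertyValueNormalizer.toValue("= This *is* valid too;") returns "This *is* valid too", and toValue(": Voilà") returns "Voilà". Inline comments are deliberately not handled in this sketch, since the text above does not say whether PROPERTY_VALUE consumes them.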
The following test case will succeed once you have registered the service in the runtime module like this:
Test case: