I’m trying to come up with a grammar for the .x format in Bison (actually a subset of the subset that the Blender Python script exports). But I’m getting a shift/reduce conflict, and Bison won’t tell me where, or only “sort of” tells me where, depending on what I try.
First, here’s a sample .x file from Blender, which is what I’m testing on:
xof 0303txt 0032
Frame Root {
FrameTransformMatrix {
1.000000, 0.000000, 0.000000, 0.000000,
0.000000, 0.000000, 1.000000, 0.000000,
0.000000, 1.000000,-0.000000, 0.000000,
0.000000, 0.000000, 0.000000, 1.000000;;
}
Frame Cube {
FrameTransformMatrix {
1.000000, 0.000000, 0.000000, 0.000000,
0.000000, 1.000000, 0.000000, 0.000000,
0.000000, 0.000000, 1.000000, 0.000000,
0.000000, 0.000000, 0.000000, 1.000000;;
}
Mesh { //Cube_002 Mesh
36;
-1.000000;-1.000000;-1.000000;,
1.000000;-1.000000;-1.000000;,
1.000000; 1.000000;-1.000000;,
0.999999;-1.000001; 1.000000;,
-1.000000; 1.000000; 1.000000;,
1.000000; 0.999999; 1.000000;,
1.000000;-1.000000;-1.000000;,
1.000000; 0.999999; 1.000000;,
1.000000; 1.000000;-1.000000;,
-1.000000;-1.000000;-1.000000;,
0.999999;-1.000001; 1.000000;,
1.000000;-1.000000;-1.000000;,
-1.000000; 1.000000; 1.000000;,
-1.000000;-1.000000; 1.000000;,
-1.000000;-1.000000;-1.000000;,
-1.000000; 1.000000;-1.000000;,
1.000000; 1.000000;-1.000000;,
1.000000; 0.999999; 1.000000;,
-1.000000;-1.000000;-1.000000;,
1.000000; 1.000000;-1.000000;,
-1.000000; 1.000000;-1.000000;,
0.999999;-1.000001; 1.000000;,
-1.000000;-1.000000; 1.000000;,
-1.000000; 1.000000; 1.000000;,
1.000000;-1.000000;-1.000000;,
0.999999;-1.000001; 1.000000;,
1.000000; 0.999999; 1.000000;,
-1.000000;-1.000000;-1.000000;,
-1.000000;-1.000000; 1.000000;,
0.999999;-1.000001; 1.000000;,
-1.000000; 1.000000; 1.000000;,
-1.000000;-1.000000;-1.000000;,
-1.000000; 1.000000;-1.000000;,
-1.000000; 1.000000;-1.000000;,
1.000000; 0.999999; 1.000000;,
-1.000000; 1.000000; 1.000000;;
12;
3;0;1;2;,
3;3;4;5;,
3;6;7;8;,
3;9;10;11;,
3;12;13;14;,
3;15;16;17;,
3;18;19;20;,
3;21;22;23;,
3;24;25;26;,
3;27;28;29;,
3;30;31;32;,
3;33;34;35;;
MeshNormals { //Cube_002 Normals
36;
0.000000; 0.000000;-1.000000;,
0.000000; 0.000000;-1.000000;,
0.000000; 0.000000;-1.000000;,
-0.000000;-0.000000; 1.000000;,
-0.000000;-0.000000; 1.000000;,
-0.000000;-0.000000; 1.000000;,
1.000000; 0.000000;-0.000000;,
1.000000; 0.000000;-0.000000;,
1.000000; 0.000000;-0.000000;,
-0.000000;-1.000000;-0.000000;,
-0.000000;-1.000000;-0.000000;,
-0.000000;-1.000000;-0.000000;,
-1.000000; 0.000000;-0.000000;,
-1.000000; 0.000000;-0.000000;,
-1.000000; 0.000000;-0.000000;,
0.000000; 1.000000; 0.000000;,
0.000000; 1.000000; 0.000000;,
0.000000; 1.000000; 0.000000;,
0.000000; 0.000000;-1.000000;,
0.000000; 0.000000;-1.000000;,
0.000000; 0.000000;-1.000000;,
0.000000;-0.000000; 1.000000;,
0.000000;-0.000000; 1.000000;,
0.000000;-0.000000; 1.000000;,
1.000000;-0.000001; 0.000000;,
1.000000;-0.000001; 0.000000;,
1.000000;-0.000001; 0.000000;,
-0.000000;-1.000000; 0.000000;,
-0.000000;-1.000000; 0.000000;,
-0.000000;-1.000000; 0.000000;,
-1.000000; 0.000000;-0.000000;,
-1.000000; 0.000000;-0.000000;,
-1.000000; 0.000000;-0.000000;,
0.000000; 1.000000; 0.000000;,
0.000000; 1.000000; 0.000000;,
0.000000; 1.000000; 0.000000;;
12;
3;0;1;2;,
3;3;4;5;,
3;6;7;8;,
3;9;10;11;,
3;12;13;14;,
3;15;16;17;,
3;18;19;20;,
3;21;22;23;,
3;24;25;26;,
3;27;28;29;,
3;30;31;32;,
3;33;34;35;;
} //End of Cube_002 Normals
MeshMaterialList { //Cube_002 Material List
1;
12;
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0;;
Material Material {
0.640000; 0.640000; 0.640000; 1.000000;;
96.078431;
0.500000; 0.500000; 0.500000;;
0.000000; 0.000000; 0.000000;;
}
} //End of Cube_002 Material List
} //End of Cube_002 Mesh
} //End of Cube
} //End of Root Frame
And here are 2 attempts at capturing the grammar:
With this one, I get: ../XGrammar.y: conflicts: 1 shift/reduce
//----------------------------------------------------
Start
: KEYWORD ID '{' EntityDeclaration '}' END
;
//----------------------------------------------------
EntityDeclaration
: KEYWORD ID '{'AttributeDeclaration'}' EntityDeclaration
| KEYWORD ID '{'AttributeDeclaration'}'
;
//----------------------------------------------------
AttributeDeclaration
: KEYWORD '{' Statement AttributeDeclaration '}' AttributeDeclaration
| KEYWORD '{' Statement '}' AttributeDeclaration
| KEYWORD '{' Statement AttributeDeclaration '}'
| KEYWORD '{' Statement '}'
;
//----------------------------------------------------
Statement
: ExpressionList ';'
| ExpressionList ';' Statement
;
//----------------------------------------------------
ExpressionList
: Expression
| Expression ',' ExpressionList
;
//----------------------------------------------------
Expression
: INTEGER
| REAL
| VecFType
;
//----------------------------------------------------
VecFType
: Vec3FType
| Vec4FType
;
//----------------------------------------------------
Vec3FType
: REAL ';' REAL ';' REAL ';'
;
//----------------------------------------------------
Vec4FType
: REAL ';' REAL ';' REAL ';' REAL ';'
;
And with the next, I get:
../XGrammar.y: conflicts: 1 shift/reduce, 2 reduce/reduce
../XGrammar.y:56.3-9: warning: rule never reduced because of conflicts: Expression: Element
//----------------------------------------------------
Start
: KEYWORD ID '{' EntityDeclaration '}' END
;
//----------------------------------------------------
EntityDeclaration
: KEYWORD ID '{'AttributeDeclaration'}' EntityDeclaration
| KEYWORD ID '{'AttributeDeclaration'}'
;
//----------------------------------------------------
AttributeDeclaration
: KEYWORD '{' Statement AttributeDeclaration '}' AttributeDeclaration
| KEYWORD '{' Statement '}' AttributeDeclaration
| KEYWORD '{' Statement AttributeDeclaration '}'
| KEYWORD '{' Statement '}'
;
//----------------------------------------------------
Statement
: Expression ';'
| Expression ';' Statement
;
//----------------------------------------------------
Expression
: Container
| Element
;
//----------------------------------------------------
Container
: VecFType
| ArrayType
;
//----------------------------------------------------
ArrayType
: ';'
| Element ArrayElement
;
//----------------------------------------------------
ArrayElement
: ArrayType
| ','Element ArrayElement
;
//----------------------------------------------------
Element
: INTEGER
| REAL
;
//----------------------------------------------------
VecFType
: ';'
| Element VecElement
;
VecElement
: VecFType
| ';'Element VecElement
;
I’m super rusty in grammars (it’s been a while since college), but I suspect it’s got to do with the wonderful way in which Microsoft decided to pull off the concept of “containers” and mix it in with commas and semicolons: http://msdn.microsoft.com/en-us/library/windows/desktop/bb206298(v=vs.85).aspx
Basically, they say, a container is sort of “implicit” and the end of it is marked by a semicolon. Which also happens to be how they mark the end of an array, and which also happens to be the way they mark (apparently) individual elements of an implied container when in the context of an array… :S So the whole thing is a shitshow.
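To make that overloading concrete, here is a small Python sketch that pairs each number in a data snippet with the run of separators that follows it. It is only an illustration, not part of the Bison grammar: the regex and the name `number_separator_pairs` are made up for this example, and it assumes the snippet contains only numbers and separators (no comments, names or braces).

```python
import re

# A number (integer or real, optionally negative) followed by one or more
# separator characters. Whitespace between the number and the run is skipped.
PAIR_RE = re.compile(r"(-?\d+(?:\.\d+)?)\s*([;,]+)")


def number_separator_pairs(text):
    """Return (number, separator_run) pairs in the order they appear."""
    return [(num, seps) for num, seps in PAIR_RE.findall(text)]


# Two faces taken from the face list of the sample file above.
for num, seps in number_separator_pairs("3;0;1;2;,\n3;3;4;5;;"):
    print(f"{num:>3}  followed by  {seps!r}")
```

In the face list, a lone `;` follows a member inside a face, `;,` closes a face and continues the array, and `;;` closes the last face together with the array. The same character does all three jobs, which is where a one-token lookahead parser runs into trouble. One possible workaround, untested here, would be to have the lexer return such runs as distinct tokens.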
A first thought was: this .x format is just not LR(1).
On a second thought, and after re-reading the Microsoft documentation, I thought that the whole vector vs. array thing is bs… (that being my interpretation of the difference between these types through my grammar). So I thought I just have to create a production for “Container” instead. Something along the lines of: there are 2 types of declarations inside the braced sections, freestanding / in-scope / local / what-have-you ones, and contained ones. Contained ones are the same type as the others, except that they take an extra semicolon at the end so as to “close” the container. But if you take a look at the Microsoft docs you’ll see it’s not necessarily quite like that.
A third approach, for which I get:
../XGrammar.y: conflicts: 4 shift/reduce, 1 reduce/reduce
../XGrammar.y:61.3-29: warning: rule never reduced because of conflicts: ContainerType: ',' Statement ContainerType
//----------------------------------------------------
Start
: KEYWORD ID '{' Entity '}' END
;
//----------------------------------------------------
Entity
: KEYWORD ID '{'Attribute'}' Entity
| KEYWORD ID '{'Attribute'}'
;
//----------------------------------------------------
Attribute
: KEYWORD '{' Statement Attribute '}' Attribute
| KEYWORD '{' Statement '}' Attribute
| KEYWORD '{' Statement Attribute '}'
| KEYWORD '{' Statement '}'
;
//----------------------------------------------------
Statement
: Declaration ';'
| Declaration ';' Statement
;
Declaration
: ElementType
| ContainerType
;
ContainerType
: Statement ',' Statement
| ',' Statement ContainerType
;
//----------------------------------------------------
ElementType
: INTEGER
| REAL
;
//----------------------------------------------------
So I’m lost. Any help appreciated.
Based on Chris Dodd’s implicit answer, I am concluding that the key problem is that, as stated, the X format cannot be captured in a context-free grammar. The weird-ass (sorry, arbitrary) usage of semicolons to mean several things at once (in a non-mutually-exclusive way) just makes it so that one can’t come up with context-free production rules.
Bottom line: parse everything from a declarative point of view and assign semantics depending on the current token value when completing a derivation.
My new “grammar” is then:
Note that I still found a way to keep the commas along with one lambda-production so as to come up with a half-arsed notion of “lists” (which is useful for the final semantics I give to the data being parsed).
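As a rough illustration of the “parse the structure first, assign meaning afterwards” idea, here is a Python sketch. It is not the Bison grammar, and it takes a simpler route than the one described above: it throws the separators away entirely and relies on the counts stored in the data. All names (`tokenize`, `parse_block`, `mesh_from_values`) are hypothetical, and it assumes the `xof` header line has already been stripped.

```python
import re

# Comments, reals, integers, names and punctuation. Anything else is
# silently skipped, which is acceptable for a sketch but not for real use.
TOKEN_RE = re.compile(r"//[^\n]*|-?\d+\.\d+|-?\d+|[A-Za-z_]\w*|[{};,]")


def tokenize(text):
    return [t for t in TOKEN_RE.findall(text) if not t.startswith("//")]


def parse_block(tokens, pos=0):
    """Parse `KEYWORD [ID] { ... }` at tokens[pos]; return (node, next_pos)."""
    keyword = tokens[pos]
    pos += 1
    name = None
    if pos < len(tokens) and tokens[pos] != "{":
        name = tokens[pos]
        pos += 1
    if pos >= len(tokens) or tokens[pos] != "{":
        raise SyntaxError(f"expected '{{' after {keyword}")
    pos += 1
    values, children = [], []
    while pos < len(tokens) and tokens[pos] != "}":
        tok = tokens[pos]
        if tok in (";", ","):
            pos += 1  # separators carry no structure at this stage
        elif tok[0].isalpha() or tok[0] == "_":
            child, pos = parse_block(tokens, pos)  # nested block
            children.append(child)
        else:
            values.append(float(tok))
            pos += 1
    if pos >= len(tokens):
        raise SyntaxError(f"missing '}}' for {keyword}")
    node = {"keyword": keyword, "name": name,
            "values": values, "children": children}
    return node, pos + 1


def mesh_from_values(values):
    """Give meaning to a Mesh block's flat number list using its counts."""
    it = iter(values)
    n_vertices = int(next(it))
    vertices = [(next(it), next(it), next(it)) for _ in range(n_vertices)]
    n_faces = int(next(it))
    faces = []
    for _ in range(n_faces):
        n_indices = int(next(it))
        faces.append(tuple(int(next(it)) for _ in range(n_indices)))
    return vertices, faces
```

Usage would be something like `tree, _ = parse_block(tokenize(text))`, then walking `tree["children"]` and calling `mesh_from_values` on the `values` of any node whose `keyword` is `Mesh`. The trade-off is that malformed separators go unnoticed, so this only fits input that is trusted to be well-formed.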