I am interested in writing a syntax checker for a language. Basically, I want to build a CLI tool that takes an input file and reports the errors it finds. The language I want to parse is similar to Turing; it is rather ugly and sometimes a pain to work with. The only other syntax checker for it must be used
What language should I use? I figured I would write it in Ruby, but Python may be faster or have better parsing libraries.
What libraries should I use, in Ruby or Perl? Which would be easier?
Is there a primer to read for defining a grammar? Such a task can become confusing, and I’m not sure how I would handle it.
If it were me, I would write it in Ruby and worry about speed later. If the program is a runaway hit, I might add a native gem to speed up the slowest bit, but leave most of it in Ruby. If it becomes the most important program in the world, or if I have nothing else to do, I might rewrite it in C or C++ at that point, but not before.
And I would do all parsing using Treetop.
I might add that writing and optimizing a language parser directly in C is an interesting learning experience. You get almost no string-handling help, so you end up doing all the parsing yourself, but you have the chance to do only the minimum amount of processing. It’s sort of the opposite of the Ruby experience. To get maximum speed you end up doing things like writing front ends for malloc, where multiple objects that you know you never have to free get allocated permanently within a single malloc'ed block. Although it is typical to use yacc(1) with C/C++, you can certainly write a recursive-descent parser by hand and have an even deeper learning experience.
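To make the malloc front end idea concrete, here is a minimal sketch of what I mean, not a definitive implementation. The names (Arena, arena_init, arena_alloc, arena_release, ARENA_ALIGN) are made up for illustration, and it assumes that every object carved out of the block lives until the whole block is released, which is usually true of tokens and parse-tree nodes in a one-shot checker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is an illustrative sketch only. */

/* Conservative alignment unit: large enough for the common scalar types. */
typedef union { long long ll; long double ld; void *p; } ArenaAlign;
#define ARENA_ALIGN sizeof(ArenaAlign)

typedef struct Arena {
    char  *base;  /* start of the single malloc'ed block */
    size_t used;  /* bytes handed out so far */
    size_t cap;   /* total size of the block */
} Arena;

/* Grab one big block up front. Returns 1 on success, 0 on failure. */
static int arena_init(Arena *a, size_t cap)
{
    assert(a != NULL);
    a->base = malloc(cap);
    a->used = 0;
    a->cap  = a->base ? cap : 0;
    return a->base != NULL;
}

/* Hand out n bytes by bumping an offset; there is no per-object free. */
static void *arena_alloc(Arena *a, size_t n)
{
    size_t rounded;
    void  *p;

    assert(a != NULL);
    if (n > a->cap - a->used)
        return NULL;                       /* cannot possibly fit */
    /* Round the request up so the next object stays aligned. */
    rounded = (n + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN;
    if (rounded > a->cap - a->used)
        return NULL;                       /* block exhausted */
    p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Release every object at once by freeing the underlying block. */
static void arena_release(Arena *a)
{
    assert(a != NULL);
    free(a->base);
    a->base = NULL;
    a->used = 0;
    a->cap  = 0;
}
```

A parser would call arena_init once at startup, use arena_alloc wherever it would otherwise call malloc for a node, and call arena_release once at exit. The saving comes from replacing many malloc/free pairs with a pointer bump; the cost is that nothing can be freed individually, and this sketch does not grow the block when it fills up.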
Of course, having done all that already, I’m happy to stick with Ruby these days.