I’m writing a (simple) compiler in Scala and have made the tokenizer iterable and now need to write the parser. The plan is to use a recursive-descent strategy and so the parser is going to be split up into a number of methods, each of which calls (some of) the others.
I assume it’s going to be necessary/preferable to maintain the state of the tokenizer iterator and share it among the various methods. Is this the case? How should I go about it? If it’s not the case, what are the alternatives?
If you have to maintain the state of the iterator, don’t use an iterator! Iterators are for when you can destroy your state as you go.
You might be able to get away with using a stream. Streams have a habit of not giving up their memory when they ought to, because references persist where you don’t want them (though you can tell where they are if you think about it). So if you started with an iterator, you could call `.toStream` on it, pass substreams into the sub-methods, and have each one hand back the remaining stream for further processing. But you’d have to be very careful not to keep a reference to the head of the stream if you wanted to avoid keeping everything in memory. (Since Scala 2.13, `Stream` is deprecated in favor of `LazyList`; the same caveat about holding on to the head applies.)
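To make the stream idea concrete, here is a minimal sketch, assuming Scala 2.13+ (so `LazyList` stands in for `Stream`). The `Token` type and the methods `number`, `plus`, `sum` and `parse` are made up for illustration; they are not from the question. Each method returns its result together with the tokens it did not consume, so the caller decides which stream to continue from.

```scala
object LazyListDemo {
  // Hypothetical token type, for illustration only.
  sealed trait Token
  final case class Num(value: Int) extends Token
  case object Plus extends Token

  // A parse result: the value plus the tokens that were not consumed.
  type Result[A] = Option[(A, LazyList[Token])]

  def number(ts: LazyList[Token]): Result[Int] =
    ts.headOption.collect { case Num(n) => (n, ts.tail) }

  def plus(ts: LazyList[Token]): Result[Unit] =
    ts.headOption.collect { case Plus => ((), ts.tail) }

  // Parses "number Plus number"; on failure the caller still has `ts`
  // untouched and can try another rule from the same position.
  def sum(ts: LazyList[Token]): Result[Int] =
    for {
      (a, rest1) <- number(ts)
      (_, rest2) <- plus(rest1)
      (b, rest3) <- number(rest2)
    } yield (a + b, rest3)

  // Converts the tokenizer's iterator once, then only passes streams around.
  def parse(source: Iterator[Token]): Option[Int] =
    sum(source.to(LazyList)).map(_._1)
}
```

Note that `sum` holds on to `ts` for as long as it runs, which is exactly the kind of lingering reference that keeps already-parsed tokens in memory.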
Another way to go is to just dump everything into a vector or array and keep the whole problem in memory; you can then drop the irrelevant parts (or advance the index) as you proceed.
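A rough sketch of the index-based variant, again with a made-up `Token` type and hypothetical method names: every parse method takes a start position and returns its result plus the index of the next unread token. Backtracking is then just calling another method with the old index.

```scala
object IndexedDemo {
  // Hypothetical token type, for illustration only.
  sealed trait Token
  final case class Num(value: Int) extends Token
  case object Plus extends Token

  final class Parser(tokens: Vector[Token]) {
    // Returns the parsed number and the index of the next unread token.
    def number(pos: Int): Option[(Int, Int)] = tokens.lift(pos) match {
      case Some(Num(n)) => Some((n, pos + 1))
      case _            => None
    }

    // Returns the index after the Plus token, if there is one at `pos`.
    def plus(pos: Int): Option[Int] =
      if (tokens.lift(pos).contains(Plus)) Some(pos + 1) else None

    // Tries "number Plus number" first; if that fails, backtracks to
    // `pos` and accepts a single number instead.
    def sumOrNumber(pos: Int): Option[(Int, Int)] = {
      val asSum = for {
        (a, p1) <- number(pos)
        p2      <- plus(p1)
        (b, p3) <- number(p2)
      } yield (a + b, p3)
      asSum.orElse(number(pos))
    }
  }
}
```

For example, `new IndexedDemo.Parser(Vector(Num(1), Plus)).sumOrNumber(0)` fails on the sum rule, falls back, and returns the single number with the `Plus` left unread at index 1.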
Finally, if you’re absolutely positive that you don’t need any backtracking, then you can just use the iterator as it is without worrying about “maintaining the state”. That is, when you get back from the sub-method, you will already have consumed exactly the right tokens and no more, and will be free to keep parsing. For this to work without at least a one-element “next token that I didn’t consume” on the return value, you need to be able to predict where the last token is. For example, a list of unbounded length would have to end with a token that is part of the list: `{1,2,3}` could be a list (you go into list processing when you see `{` and drop out when you hit `}`), but `1,2,3 + 7` could not (you’d consume the `+` before you realized that the list was over).
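If a single token of lookahead is all that is missing, one option is the standard library’s `BufferedIterator` (obtained with `.buffered`), which lets you look at `head` without consuming it. The sketch below uses a made-up `Token` type and a hypothetical `parseList` method; it handles the `1,2,3 + 7` case by stopping before the `+`.

```scala
object LookaheadDemo {
  // Hypothetical token type, for illustration only.
  sealed trait Token
  final case class Num(value: Int) extends Token
  case object Comma extends Token
  case object Plus extends Token

  // Parses Num (Comma Num)* and stops before the first token that is
  // not part of the list, leaving it in the iterator for the caller.
  def parseList(tokens: BufferedIterator[Token]): List[Int] = {
    val items = List.newBuilder[Int]

    def number(): Unit = tokens.next() match {
      case Num(n) => items += n
      case other  => throw new IllegalArgumentException(s"expected a number, got $other")
    }

    number()
    while (tokens.hasNext && tokens.head == Comma) {
      tokens.next() // consume the comma we just peeked at
      number()
    }
    items.result()
  }
}
```

All the recursive-descent methods can share the same `BufferedIterator`; after `parseList` returns, the caller sees the `Plus` token as the next element.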