I have a fairly simple question about ParseKit and parsing timestamps: how do I force a dot/period to be treated as a symbol?
For example, if I am trying to parse 2008-01-25, I could use something like date = /\d{4}/ '-' /\d{2}/ '-' /\d{2}/. In fact, there is a date.grammar shipped with ParseKit that does exactly this (interestingly enough, though, the provided grammar doesn’t work in the DemoApp unless you add @symbolState='-';, but I digress…)
However, what do I do if I want to parse a date with dots in it, for example 2008.01.25 or 2008-01-25-12.34.45? I’ve tried adding '.' to the @symbolState directive, but it just keeps getting ignored. Note that I am relying on the DemoApp to test my grammars at the moment; not sure if that makes any difference.
Any thoughts would be much appreciated.
Developer of ParseKit here.
First, thanks for the heads up on the bug in the date.grammar file. I have fixed it.
As for your main question, I’m pretty sure what you are trying was not possible with ParseKit until now.
That is, ParseKit’s tokenizer (PKTokenizer) was not able to produce only whole-number Number tokens. Numbers were always tokenized as floating point, which means it was impossible to parse input like 3.14 as three separate tokens: 3 (Number), . (Symbol), 14 (Number). Rather, it would always be tokenized as 3.14.
Good news: I’ve added this capability with a new method:
which defaults to YES.
And I added a matching Tokenizer Directive which you can use in your ParseKit Grammars like:
NOTE: you’ll need to check out the latest HEAD of trunk on Google Code to see this feature.
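To make the tokenization difference described above concrete, here is a rough sketch in Python. It is not ParseKit code and does not use PKTokenizer; the function name tokenize and the flag allows_floating_point are hypothetical names chosen only for this illustration. It simply shows why 2008.01.25 cannot be split on dots while numbers are read as floating point.

```python
import re

def tokenize(text, allows_floating_point=True):
    """Toy tokenizer (not ParseKit): returns (kind, value) pairs.

    allows_floating_point is a hypothetical flag for this sketch.
    When True, a dot followed by digits is absorbed into the number.
    When False, numbers are whole and the dot becomes its own Symbol.
    """
    number = r"\d+(?:\.\d+)?" if allows_floating_point else r"\d+"
    # Symbol here means any single character that is not whitespace
    # or a word character, e.g. '.' or '-'.
    pattern = re.compile(rf"(?P<Number>{number})|(?P<Symbol>[^\s\w])")
    return [(m.lastgroup, m.group()) for m in pattern.finditer(text)]

print(tokenize("3.14"))
print(tokenize("3.14", allows_floating_point=False))
print(tokenize("2008.01.25", allows_floating_point=False))
```

With floating point allowed, 2008.01.25 comes out as 2008.01 (Number), . (Symbol), 25 (Number), which is why a grammar expecting three numbers separated by dots never matches.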
So, here’s an example date grammar which does roughly what you were asking for with the new feature: