I have the grammar file alexa_scrape.tt:
grammar AlexaScrape
  rule document
    category_listing*
  end
  rule category_listing
    category_line url_line*
  end
  rule category_line
    category "\n"
  end
  rule category
    ("/" [^/]+)+
  end
  rule url_line
    [0-9]+ ". " url "\n"
  end
  rule url
    [^\n]*
  end
end
I have a ruby file which attempts to make use of it:
#!/usr/bin/env ruby -I .
require 'rubygems'
require 'polyglot'
require 'treetop'
require 'alexa_scrape.tt'
parser = AlexaScrapeParser.new
p( parser.parse("") || parser.failure_reason )
p( parser.parse("/x\n") || parser.failure_reason )
But I’m not getting the results I expected:
SyntaxNode offset=0, ""
"Expected one of /, \n at line 2, column 1 (byte 4) after /x\n"
It parses the empty string properly (as the trivial match for document, zero category_listings), but fails to parse "/x\n" (as the document containing a single category_listing that itself has zero url_lines).
What am I doing wrong?
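It looks like the regex in category is advancing through the whitespace needed to match category_line: [^/]+ also matches "\n", so category consumes the trailing newline and category_line has nothing left to match its own "\n". Exclude the newline from the character class in category, i.e. use [^/\n]+ instead of [^/]+.
(And, wow, a Treetop question. This is number 47 in the history of SO and its 4 million total questions. One in 87,000 SO questions is tagged Treetop.)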
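To see why the newline gets swallowed, here is a minimal plain-Ruby sketch. It is an analogy, not Treetop itself: it assumes that a Treetop character class such as [^/] matches the same characters as the equivalent Ruby regex class, and the variable names are made up for illustration. It compares the original category pattern with a candidate fix that also excludes "\n", i.e. ("/" [^/\n]+)+.

```ruby
# Plain-Ruby analogy of the category rule (assumption: Treetop character
# classes match the same characters as the equivalent Ruby regex classes).
input = "/x\n"

# Original rule body: ("/" [^/]+)+  -- [^/] also matches "\n".
greedy = %r{\A(?:/[^/]+)+}
# Candidate fix: ("/" [^/\n]+)+  -- the class now stops at the newline.
fixed = %r{\A(?:/[^/\n]+)+}

greedy_match = input[greedy] # => "/x\n"
fixed_match  = input[fixed]  # => "/x"

# What remains for category_line's trailing "\n" once category has matched.
greedy_rest = input[greedy_match.length..-1]
fixed_rest  = input[fixed_match.length..-1]

p greedy_match, greedy_rest
p fixed_match, fixed_rest
```

With the original class nothing is left over for the "\n" that category_line expects. A PEG parser like Treetop does not go back and give up part of what category already consumed, which would explain the failure reported at byte 4. With the narrower class, the newline remains available for category_line.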