I’m trying to write a parser with treetop to parse some latex commands into

Asked: June 17, 2026 (2026-06-17T16:31:48+00:00)

I’m trying to write a parser with Treetop to parse some LaTeX commands into HTML markup. With the grammar below, the generated parser goes into an infinite loop. I’ve built the source code with tt and stepped through it, but that doesn’t really show what the underlying issue is (it just spins in _nt_paragraph).

Test input: "\emph{hey} and some more text."

grammar Latex
  rule document
    (paragraph)* {
      def content
        [:document, elements.map { |e| e.content }]
      end
    }
  end

  # Example: These aren't the \emph{droids you're looking for} \n\n.
  rule paragraph
    ( text / tag )* eop {
      def content
        [:paragraph, elements.map { |e| e.content } ]
      end
    }
  end

  rule text
    ( !( tag_start / eop) . )* {
      def content
        [:text, text_value ]
      end
    }
  end

  # Example: \tag{inner_text}
  rule tag
    "\\emph{" inner_text '}' {
      def content
        [:tag, inner_text.content]
      end
    }
  end 

  # Example: \emph{inner_text}
  rule inner_text
    ( !'}' . )* {
      def content
        [:inner_text, text_value]
      end
    }
  end

  # End of paragraph.
  rule eop
    newline 2.. {
      def content
        [:newline, text_value]
      end
    }
  end

  rule newline
    "\n"
  end

  # You know, what starts a tag
  rule tag_start
    "\\"
  end

end


1 Answer

Editorial Team · Answer 1 · 2026-06-17T16:31:49+00:00

For anyone curious, Clifford over at the Treetop dev Google group figured this out.

The problem was with paragraph and text.

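The text rule matches 0 or more characters, and a paragraph contains 0 or more text nodes. Wherever text cannot advance (at a backslash or at the end of a paragraph), it still succeeds with a 0-length match, so the ( text / tag )* loop in paragraph keeps matching empty text at the same position forever and the parser spins. Because text comes first in the choice and never fails, tag is never even tried. The fix was to adjust text to be: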

( !( tag_start / eop) . )+

So that it must have at least one character to match.
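To see why the repetition never ends, here is a minimal sketch in plain Ruby that imitates the shape of the text, tag, and ( text / tag )* rules. It is not Treetop's generated code: the helper names (match_text, match_tag, match_items) and the iteration cap are assumptions added only so the demonstration terminates.

```ruby
# Hypothetical mini-matchers. Each returns the new position on success
# or nil on failure. They mimic the grammar's rules by hand.

# text: ( !( tag_start / eop ) . ) repeated at least `min` times
def match_text(input, pos, min)
  start = pos
  # Stop at a backslash (tag_start) or at two newlines (eop).
  while pos < input.length && input[pos] != "\\" && input[pos, 2] != "\n\n"
    pos += 1
  end
  (pos - start) >= min ? pos : nil
end

# tag: "\\emph{" inner_text '}'
def match_tag(input, pos)
  return nil unless input[pos, 6] == "\\emph{"
  close = input.index("}", pos + 6)
  close ? close + 1 : nil
end

# ( text / tag )* with a cap on iterations, so a spin shows up as
# "capped" instead of hanging the process.
def match_items(input, min_text, max_iterations = 50)
  pos = 0
  iterations = 0
  while iterations < max_iterations
    # Ordered choice: try text first, then tag. Note that 0 is truthy
    # in Ruby, so a zero-length match at position 0 counts as success.
    nxt = match_text(input, pos, min_text) || match_tag(input, pos)
    break if nxt.nil?
    pos = nxt
    iterations += 1
  end
  { pos: pos, iterations: iterations, capped: iterations == max_iterations }
end

INPUT = "\\emph{hey} and some more text."
STAR_RESULT = match_items(INPUT, 0) # text written with *
PLUS_RESULT = match_items(INPUT, 1) # text written with +

puts "text with *: #{STAR_RESULT.inspect}"
puts "text with +: #{PLUS_RESULT.inspect}"
```

With the starred text, the loop hits the cap without ever leaving position 0. With the plus variant, the empty match fails, tag gets its turn, and the loop consumes the whole test input in two iterations before stopping on its own.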
