I am converting some functioning Haskell code that uses Parsec to instead use Attoparsec in the hope of getting better performance. I have made the changes and everything compiles but my parser does not work correctly.
I am parsing a file that consists of various record types, one per line. Each of my individual functions for parsing a record or comment works correctly, but when I try to write a function that collects a sequence of records, the parser always returns a partial result because it is expecting more input.
These are the two main variations that I’ve tried. Both have the same problem.
items :: Parser [Item]
items = sepBy (comment <|> recordType1 <|> recordType2) endOfLine
For this second one I changed the record/comment parsers to consume the end-of-line characters.
items :: Parser [Item]
items = manyTill (comment <|> recordType1 <|> recordType2) endOfInput
Is there anything wrong with my approach? Is there some other way to achieve what I am attempting?
I’ve run into this problem before, and my understanding is that it’s caused by the way that <|> works in the definition of sepBy. The parser will only move on to pure [] once (s *> scan) has failed, which won’t happen just because you’re at the end of the input: when the input runs out, Attoparsec returns a partial result and waits for more instead of failing.
My solution has been just to call feed with an empty ByteString on the Result returned by parse. This might be kind of a hack, but it also seems to be how attoparsec-iteratee deals with the issue.
As far as I can tell, this is the only reason that attoparsec-iteratee works here and plain old parse doesn’t.
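Here is a minimal sketch of that workaround, assuming the input is a strict ByteString parsed with Data.Attoparsec.ByteString.Char8. The Item type, the comment and record parsers, and the parseComplete helper are hypothetical stand-ins for the question's own definitions, not anything the library provides.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.Applicative ((<|>))
import Data.Attoparsec.ByteString.Char8
import qualified Data.ByteString.Char8 as B

-- Hypothetical stand-in for the question's Item type.
data Item = Comment B.ByteString | Record B.ByteString
  deriving (Eq, Show)

-- Local end-of-line predicate on Char (assumed helper).
isEol :: Char -> Bool
isEol c = c == '\n' || c == '\r'

-- Hypothetical comment parser: '#' followed by the rest of the line.
comment :: Parser Item
comment = Comment <$> (char '#' *> takeTill isEol)

-- Hypothetical record parser: any non-empty line.
record :: Parser Item
record = Record <$> takeWhile1 (not . isEol)

-- Same shape as the first variation in the question.
items :: Parser [Item]
items = sepBy (comment <|> record) endOfLine

-- Run a parser, then feed an empty chunk to signal that no more
-- input is coming, so a Partial result is forced to finish.
parseComplete :: Parser a -> B.ByteString -> Either String a
parseComplete p input = eitherResult (feed (parse p input) B.empty)

main :: IO ()
main = do
  let input = "# a comment\nrec1\nrec2"
  -- Plain parse stops at a Partial result, so this prints a Left.
  print (eitherResult (parse items input))
  -- After feeding the empty chunk the parse completes with a Right.
  print (parseComplete items input)
```

If the whole input is already in memory, recent versions of attoparsec also provide parseOnly, which runs the parser without allowing any further input and so should avoid the partial result in the same way.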