I’m a beginner with Haskell, so it might be very obvious what I’m doing wrong…
While trying to parse "1:1,2, 2:18, 3:100" into [(1,1), (1,2), (2,18), (3,100)] I got stuck on a lookahead.
To know if a number is a verse number it should look ahead for a colon, because then it is a chapter number instead.
The problem lies in the last function, verseNr: it should parse and consume the number if it is not followed by a colon, and otherwise fail without consuming anything (leaving the number to be parsed as a chapter number by refGroupByChapter).
Except for this issue it seems to work nicely 🙂
import Text.ParserCombinators.Parsec

main = do
  case (parse refString "(unknown)" "1:1,2, 2:18, 3:100") of
    Left e -> do putStr "parse error at "; print e
    Right x -> print x -- expecting: [(1,1), (1,2), (2,18), (3,100)]

refString :: GenParser Char st [(Int, Int)]
refString = do
  refGroups <- many refGroupByChapter
  eof
  return $ concat $ map flatten refGroups
  where flatten (_, []) = []
        flatten (c, v:vs) = (c, v):(flatten (c, vs))

refGroupByChapter :: GenParser Char st (Int, [Int])
refGroupByChapter = do
  chapterNum <- many digit
  char ':'
  verseNums <- verseNrs
  return ((read chapterNum :: Int), verseNums)

verseNrs :: GenParser Char st [Int]
verseNrs = do
  first <- verseNr
  remaining <- remainingVerseNrs
  return (first:remaining)
  where
    remainingVerseNrs = do -- allow for spaces around the commas
      (spaces >> oneOf "," >> spaces >> verseNrs) <|> (return [])

verseNr = try $ do
  n <- many1 digit
  notFollowedBy $ char ':' -- if followed by a ':' it's a chapter number
  return (read n :: Int)
The trick for your particular problem would be to use the sepBy family of functions. You’re parsing lists of numbers separated by commas, which is exactly what sepBy is for. A list of verses has the following properties: there has to be at least one verse number, and the list may end with a trailing comma (the one just before the next chapter number). Combining the two, we realize we need the sepEndBy1 function. These functions are usually written in an infix position (between backticks), so verseNrs becomes verseNr `sepEndBy1` followed by a parser for your comma separator.

I don’t think you need to change anything else to get the code to work.
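To make the suggestion concrete, here is a minimal sketch of the whole parser with verseNrs rewritten around sepEndBy1. It is one way the idea could be applied, not necessarily the exact code intended above. It makes three assumptions that are not in the original post: the separator is a comma followed by optional spaces (spaces before a comma are not handled), the chapter number uses many1 digit instead of many digit so that read never sees an empty string, and verseNr gets an explicit type signature.

```haskell
import Text.ParserCombinators.Parsec

-- Whole reference string: zero or more chapter groups, then end of input.
refString :: GenParser Char st [(Int, Int)]
refString = do
  refGroups <- many refGroupByChapter
  eof
  -- Pair every verse with the chapter it belongs to.
  return [ (c, v) | (c, vs) <- refGroups, v <- vs ]

-- One chapter group such as "1:1,2, ": chapter number, colon, verse list.
refGroupByChapter :: GenParser Char st (Int, [Int])
refGroupByChapter = do
  chapterNum <- many1 digit   -- assumption: many1, so `read` never gets ""
  _ <- char ':'
  verseNums <- verseNrs
  return (read chapterNum, verseNums)

-- One or more verse numbers, separated and optionally ended by a comma.
-- Assumed separator: a comma followed by optional spaces.
verseNrs :: GenParser Char st [Int]
verseNrs = verseNr `sepEndBy1` (char ',' >> spaces)

-- A number that is NOT followed by ':'. Thanks to `try`, a failure here
-- consumes no input, so the number can be re-read as a chapter number.
verseNr :: GenParser Char st Int
verseNr = try $ do
  n <- many1 digit
  notFollowedBy (char ':')
  return (read n)
```

In GHCi, parse refString "(unknown)" "1:1,2, 2:18, 3:100" should then give Right [(1,1),(1,2),(2,18),(3,100)]. The difference from the hand-written recursion is that sepEndBy1 treats "separator consumed, but no further item" as a normal end of the list instead of a failure.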
A couple of other minor style notes: you have some unnecessary parentheses. This isn’t important, it just annoys me personally. E.g. in case ... of you do not need parens around the ... bit. Also, you do not need the type signature when you use read, because the compiler can infer the type. That is, since verseNrs returns [Int], it’s completely clear both to the compiler and to me that read n produces an Int. There is no need to say it explicitly.
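Both style points can be seen in a small standalone sketch. The names verseNumber and render are hypothetical helpers made up for this illustration; they do not appear in the question.

```haskell
import Text.ParserCombinators.Parsec

-- The signature already says this parser yields an Int,
-- so `read n` needs no `:: Int` annotation.
verseNumber :: GenParser Char st Int
verseNumber = do
  n <- many1 digit
  return (read n)

-- No parentheses are needed around the expression between `case` and `of`.
render :: String -> String
render input = case parse verseNumber "(unknown)" input of
  Left e  -> "parse error at " ++ show e
  Right x -> show x
```

Here render "18" evaluates to "18", and an input with no leading digit produces a string starting with "parse error at ".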