I’m trying to make a parser for a simple functional language, a bit like Caml, but I seem to be stuck with the simplest things.
So I’d like to know if there are some more complete examples of parsec parsers, something that goes beyond “this is how you parse 2 + 3”. Especially function calls in terms and suchlike.
And I’ve read “Write you a Scheme”, but the syntax of scheme is quite simple and not really helping for learning.
The most problems I have is how to use try, <|> and choice properly, because I really don’t get why parsec never seems to parse a(6) as a function call using this parser:
expr = choice [number, call, ident]
number = liftM Number float <?> "Number"
ident = liftM Identifier identifier <?> "Identifier"
call = do
name <- identifier
args <- parens $ commaSep expr
return $ FuncCall name args
<?> "Function call"
EDIT Added some code for completion, though this is actually not the thing I asked:
AST.hs
module AST where
data AST
= Number Double
| Identifier String
| Operation BinOp AST AST
| FuncCall String [AST]
deriving (Show, Eq)
data BinOp = Plus | Minus | Mul | Div
deriving (Show, Eq, Enum)
Lexer.hs
module Lexer (
identifier, reserved, operator, reservedOp, charLiteral, stringLiteral,
natural, integer, float, naturalOrFloat, decimal, hexadecimal, octal,
symbol, lexeme, whiteSpace, parens, braces, angles, brackets, semi,
comma, colon, dot, semiSep, semiSep1, commaSep, commaSep1
) where
import Text.Parsec
import qualified Text.Parsec.Token as P
import Text.Parsec.Language (haskellStyle)
lexer = P.makeTokenParser haskellStyle
identifier = P.identifier lexer
reserved = P.reserved lexer
operator = P.operator lexer
reservedOp = P.reservedOp lexer
charLiteral = P.charLiteral lexer
stringLiteral = P.stringLiteral lexer
natural = P.natural lexer
integer = P.integer lexer
float = P.float lexer
naturalOrFloat = P.naturalOrFloat lexer
decimal = P.decimal lexer
hexadecimal = P.hexadecimal lexer
octal = P.octal lexer
symbol = P.symbol lexer
lexeme = P.lexeme lexer
whiteSpace = P.whiteSpace lexer
parens = P.parens lexer
braces = P.braces lexer
angles = P.angles lexer
brackets = P.brackets lexer
semi = P.semi lexer
comma = P.comma lexer
colon = P.colon lexer
dot = P.dot lexer
semiSep = P.semiSep lexer
semiSep1 = P.semiSep1 lexer
commaSep = P.commaSep lexer
commaSep1 = P.commaSep1 lexer
Parser.hs
module Parser where
import Control.Monad (liftM)
import Text.Parsec
import Text.Parsec.String (Parser)
import Lexer
import AST
expr = number <|> callOrIdent
number = liftM Number float <?> "Number"
callOrIdent = do
name <- identifier
liftM (FuncCall name) (parens $ commaSep expr) <|> return (Identifier name)
Hmm,
that part works for me after filling out the missing pieces.
Edit: I filled out the missing pieces by writing my own
floatparser, which could parse integer literals. Thefloatparser fromText.Parsec.Tokenon the other hand, only parses literals with a fraction part or an exponent, so it failed parsing the “6”.However,
when call fails after having parsed an identifier, that part of the input is consumed, hence ident isn’t tried, and the overall parse fails. You can a) make it
try callin the choice list ofexpr, so that call fails without consuming input, or b) write a parser callOrIdent to use inexpr, e.g.which avoids
tryand thus may perform better.