I can’t figure out how to implement an Applicative instance for this parser:
newtype Parser m s a = Parser { getParser :: [s] -> m ([s], a) }
without assuming Monad m. I expected to only have to assume Applicative m, since the Functor instance only has to assume Functor m. I finally ended up with:
instance Functor m => Functor (Parser m s) where
    fmap f (Parser g) = Parser (fmap (fmap f) . g)
instance Monad m => Applicative (Parser m s) where
    pure a = Parser (\xs -> pure (xs, a))
    Parser f <*> Parser x = Parser h
      where
        h xs = f xs >>= \(ys, f') ->
               x ys >>= \(zs, x') ->
               pure (zs, f' x')
How do I do this? I tried substituting in for >>= by hand, but always wound up getting stuck trying to reduce a join — which would also require Monad.
I also consulted Parsec, but even that wasn’t much help:
instance Applicative.Applicative (ParsecT s u m) where
    pure = return
    (<*>) = ap
My reasons for asking this question are purely self-educational.
Full marks for aiming to use Applicative as much as possible – it’s much cleaner.
Headline: Your parser can stay Applicative, but your collection of possible parses needs to be stored in a Monad. Internal structure: uses a monad. External structure: is applicative.
You’re using m ([s],a) to represent a bunch of possible parses. When you parse the next input, you want it to depend on what’s already been parsed, but you’re using m because there’s potentially less than or more than one possible parse; you want to do \([s],a) -> ... and work with that to make a new m ([s],a). That process is called binding and uses >>= or equivalent, so your container is definitely a Monad, no escape.
It’s not all that bad using a monad for your container – it’s just a container you’re keeping some stuff in after all. There’s a difference between using a monad internally and being a monad. Your parsers can be applicative whilst using a monad inside.
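To make the "no escape" point concrete, here is a minimal sketch built on the question's own Parser type. The helpers nested, viaJoin, satisfy and twoDigits are names invented for this illustration. With only fmap available on m, sequencing two parsers leaves you with a nested m (m ...), and collapsing it is exactly join, which needs Monad. This is the same shape as StateT [s] m from the transformers package (with the tuple flipped), whose Applicative instance also asks for Monad m.

```haskell
import Control.Monad (join)
import Data.Char (isDigit)

newtype Parser m s a = Parser { getParser :: [s] -> m ([s], a) }

instance Functor m => Functor (Parser m s) where
    fmap f (Parser g) = Parser (fmap (fmap f) . g)

-- Hypothetical helper: with only Functor on m, the second parser has to run
-- *inside* the result of the first, so the two layers of m stay nested.
nested :: Functor m
       => Parser m s (a -> b) -> Parser m s a -> [s] -> m (m ([s], b))
nested (Parser f) (Parser x) xs =
    fmap (\(ys, f') -> fmap (fmap f') (x ys)) (f xs)

-- Hypothetical helper: join is the step that flattens the layers,
-- and join is only available for a Monad.
viaJoin :: Monad m => Parser m s (a -> b) -> Parser m s a -> Parser m s b
viaJoin pf px = Parser (join . nested pf px)

-- The instance from the question, written with do-notation.
instance Monad m => Applicative (Parser m s) where
    pure a = Parser (\xs -> pure (xs, a))
    Parser f <*> Parser x = Parser $ \xs -> do
        (ys, f') <- f xs      -- run the first parser
        (zs, x') <- x ys      -- feed its leftover input to the second
        pure (zs, f' x')

-- Hypothetical primitive: accept one symbol matching a predicate,
-- using Maybe as the container m.
satisfy :: (s -> Bool) -> Parser Maybe s s
satisfy p = Parser go
  where
    go (c:cs) | p c = Just (cs, c)
    go _            = Nothing

-- The external interface is purely applicative.
twoDigits :: Parser Maybe Char (Char, Char)
twoDigits = (,) <$> satisfy isDigit <*> satisfy isDigit
```

In GHCi, getParser twoDigits "42x" gives Just ("x",('4','2')), and getParser twoDigits "4x" gives Nothing. Note that twoDigits itself never mentions >>=; only the instance does.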
See "What are the benefits of applicative parsing over monadic parsing?"
If your parsers are applicative, they’re simpler, so in theory you can do some optimisation when you combine them, by keeping static information about what they do instead of keeping their implementation. For example,
The second version is better than the first because it does no backtracking.
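As a hypothetical illustration of that kind of comparison, the sketch below writes one grammar two ways. The parsers unfactored and factored, the string helper, and the choice of Maybe as the container are all assumptions made for this example. The first version restarts from the beginning of the input when its first alternative fails; the second matches the shared prefix once and only then chooses. Here the factoring is done by hand; the point of keeping static information is that an applicative library could, in principle, do it for you.

```haskell
import Control.Applicative (Alternative (..))
import Data.List (stripPrefix)

newtype Parser m s a = Parser { getParser :: [s] -> m ([s], a) }

instance Functor m => Functor (Parser m s) where
    fmap f (Parser g) = Parser (fmap (fmap f) . g)

instance Monad m => Applicative (Parser m s) where
    pure a = Parser (\xs -> pure (xs, a))
    Parser f <*> Parser x = Parser $ \xs -> do
        (ys, f') <- f xs
        (zs, x') <- x ys
        pure (zs, f' x')

-- Choice: try the left parser, fall back to the right one on the SAME input.
instance (Monad m, Alternative m) => Alternative (Parser m s) where
    empty = Parser (const empty)
    Parser p <|> Parser q = Parser (\xs -> p xs <|> q xs)

-- Hypothetical primitive: match a fixed sequence of symbols.
string :: Eq s => [s] -> Parser Maybe s [s]
string pat = Parser $ \xs -> fmap (\rest -> (rest, pat)) (stripPrefix pat xs)

-- First version: on "Hello Mum" the left branch reads "Hello " and fails,
-- then the right branch starts over and reads "Hello " again.
unfactored :: Parser Maybe Char String
unfactored = string "Hello World" <|> string "Hello Mum"

-- Second version: the common prefix is consumed once, then the choice is made.
factored :: Parser Maybe Char String
factored = (++) <$> string "Hello " <*> (string "World" <|> string "Mum")
```

Both accept the same inputs and return the same results; they differ only in how much input gets re-read when the first alternative fails.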
If you do a lot of this, it’s like when a regular expression is compiled before it’s run, creating a graph (finite state automaton) and simplifying it as much as possible and eliminating a whole load of inefficient backtracking.