I am translating some code from attoparsec to Parsec, because the parser needs to produce better error messages. The attoparsec code uses inClass (and notInClass) extensively. Is there a similar function for Parsec that would let me translate the occurrences of inClass mechanically? Hayoo and Hoogle didn’t offer any insight into the matter.
inClass :: String -> Char -> Bool
inClass "a-c'-)0-3-" is equivalent to \ x -> elem x "abc'()0123-", but the latter is inefficient and tedious to write for large ranges.
I will reimplement the function myself if nothing else is available.
There isn’t any such combinator; if there was, it would be in Text.Parsec.Char (which is where all the standard parser combinator functions that involve Char are defined). You should be able to define it fairly easily.
I don’t think you’ll be able to get the same performance advantages attoparsec does with its implementation, though; it relies on the internal FastSet type, which only works with 8-bit characters. Of course, if you don’t need Unicode support, that might not be a problem, but the code for FastSet implies you’ll get unpredictable results passing Chars greater than '\255', so if you want to reuse the FastSet-based solution, you’ll at least have to read the strings you’re parsing in binary mode. (You’ll also have to copy the implementation of FastSet into your program, as it’s not exported…)
If your range strings are short, then a simple solution like this is likely to be pretty fast:
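A minimal sketch of such a simple version, using only the Prelude. This is an illustration written under the semantics described in the question, not code taken from attoparsec or Parsec; parseClass is a hypothetical helper name. It turns the class string into a list of inclusive ranges once and then checks the character against each range.

```haskell
-- Split a class string into inclusive (low, high) ranges.
-- "a-c" becomes ('a','c'); a character that does not start a range
-- (including a leading or trailing '-') becomes a one-element range.
-- parseClass is a hypothetical helper, not part of any library.
parseClass :: String -> [(Char, Char)]
parseClass (a:'-':b:rest) = (a, b) : parseClass rest
parseClass (x:rest)       = (x, x) : parseClass rest
parseClass []             = []

-- True if the character falls into any range of the class string.
-- 'ranges' is bound outside the lambda, so a shared partial
-- application (inClass s) can reuse the parsed list across calls.
inClass :: String -> Char -> Bool
inClass s = \c -> any (\(lo, hi) -> lo <= c && c <= hi) ranges
  where ranges = parseClass s

-- Complement of inClass.
notInClass :: String -> Char -> Bool
notInClass s = not . inClass s
```

With Parsec, the predicate would be passed to satisfy from Text.Parsec.Char, for example satisfy (inClass "a-zA-Z0-9_").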
You could even try something like this, which should be at least as efficient as the above version (including when many calls to a single inClass s are made), and additionally avoid the list traversal overhead (taking care to move the recursion out of the lambda; I don’t know if GHC can/will do this itself):
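One way this could look, again as a sketch rather than a definitive implementation: instead of building an intermediate list, each clause returns a predicate that tests its own range and then falls through to the predicate for the rest of the class string. The recursive call is bound in a where clause outside the lambda, which is the "move the recursion out of the lambda" step; the name next is an arbitrary choice.

```haskell
-- Build the predicate directly as a chain of closures.
-- 'next' is bound outside each lambda, so the predicate for the
-- remaining class string is constructed once per (inClass s) value
-- instead of once per character tested.
inClass :: String -> Char -> Bool
inClass (a:'-':b:rest) = \c -> (a <= c && c <= b) || next c
  where next = inClass rest
inClass (x:rest) = \c -> c == x || next c
  where next = inClass rest
inClass [] = const False
```

Whether this is actually faster than the list-based version depends on how GHC optimizes both, so it is worth benchmarking before committing to either.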