I’ve started learning FParsec. It has a very flexible way to parse numbers; I can provide a set of number formats I want to use:
    type Number =
        | Numeral of int
        | Decimal of float
        | Hexadecimal of int
        | Binary of int

    let numberFormat = NumberLiteralOptions.AllowFraction
                       ||| NumberLiteralOptions.AllowHexadecimal
                       ||| NumberLiteralOptions.AllowBinary

    let pnumber =
        numberLiteral numberFormat "number"
        |>> fun num -> if num.IsHexadecimal then Hexadecimal (int num.String)
                       elif num.IsBinary then Binary (int num.String)
                       elif num.IsInteger then Numeral (int num.String)
                       else Decimal (float num.String)
However, the language I’m trying to parse is a bit strange. A number can be a numeral (non-negative int), a decimal (non-negative float), a hexadecimal (with prefix #x) or a binary (with prefix #b):
numeral: 0, 2
decimal: 0.2, 2.0
hexadecimal: #xA04, #x611ff
binary: #b100, #b001
Right now I have to parse twice, replacing # with 0 where necessary so that I can reuse pnumber:
    let number: Parser<_, unit> =
        let isDotOrDigit c = isDigit c || c = '.'
        let numOrDec = many1Satisfy2 isDigit isDotOrDigit
        let hexOrBin = skipChar '#' >>. manyChars (letter <|> digit) |>> sprintf "0%s"
        let str = spaces >>. numOrDec <|> hexOrBin
        str |>> fun s -> match run pnumber s with
                         | Success(result, _, _) -> result
                         | Failure(errorMsg, _, _) -> failwith errorMsg
What is a better way of parsing in this case? Or how can I alter FParsec’s CharStream to be able to make conditional parsing easier?
Parsing numbers can be pretty messy if you want to generate good error messages and properly check for overflows.
The following is a simple FParsec implementation of your number parser:
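A minimal sketch of such a parser, assuming the `Number` type from the question. The names `numeralOrDecimal`, `hexNumber` and `binaryNumber` are illustrative, and the conversions raise an exception on overflow rather than reporting a parser error:

```fsharp
#r "nuget: FParsec" // F# Interactive only; use a package reference in a compiled project
open FParsec

type Number =
    | Numeral of int
    | Decimal of float
    | Hexadecimal of int
    | Binary of int

// Numerals and decimals: AllowFraction alone accepts "2" and "2.0",
// but no sign and no exponent suffix.
let numeralOrDecimal : Parser<Number, unit> =
    numberLiteral NumberLiteralOptions.AllowFraction "number"
    |>> fun num ->
            // int and float raise an exception on overflow
            if num.IsInteger then Numeral (int num.String)
            else Decimal (float num.String)

// "#x" followed by at least one hex digit.
let hexNumber : Parser<Number, unit> =
    pstring "#x" >>. many1SatisfyL isHex "hexadecimal digit"
    |>> fun digits -> Hexadecimal (System.Convert.ToInt32(digits, 16))

// "#b" followed by at least one binary digit.
let binaryNumber : Parser<Number, unit> =
    pstring "#b" >>. many1SatisfyL (fun c -> c = '0' || c = '1') "binary digit"
    |>> fun digits -> Binary (System.Convert.ToInt32(digits, 2))

// Each alternative fails without consuming input on a wrong prefix,
// so the next one gets a chance to run.
let number : Parser<Number, unit> =
    choiceL [ numeralOrDecimal; hexNumber; binaryNumber ] "number literal"
```

With this sketch, `run number "#xA04"` should succeed with a `Hexadecimal` value and `run number "0.2"` with a `Decimal` value; the input is parsed once, with no string substitution.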
Generating good error messages on overflows would complicate this implementation a bit, as you would ideally also need to backtrack after the error, so that the error position ends up at the start of the number literal (see the numberLiteral docs for an example).
A simple way to handle possible overflow exceptions gracefully is to use a small exception-handling combinator and wrap the number parser in it.
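A minimal sketch of such a combinator and its use. The name `mayThrow` and the small `hexInt` parser are illustrative assumptions, not part of FParsec; the combinator restores the stream state so the error is reported at the start of the literal:

```fsharp
#r "nuget: FParsec" // F# Interactive only; use a package reference in a compiled project
open FParsec

// Runs p and turns any exception it throws into a fatal parser error
// reported at the position where p started.
let mayThrow (p: Parser<'t, 'u>) : Parser<'t, 'u> =
    fun stream ->
        let state = stream.State
        try
            p stream
        with e -> // catching every exception is broad; narrow it if you can
            stream.BacktrackTo(state)
            Reply(FatalError, messageError e.Message)

// Hypothetical parser whose conversion throws on values that do not fit an int.
let hexInt : Parser<int, unit> =
    pstring "#x" >>. many1SatisfyL isHex "hexadecimal digit"
    |>> fun digits -> System.Convert.ToInt32(digits, 16)

// Wrapped version: an overflow becomes a parse failure instead of an exception.
let safeHexInt = mayThrow hexInt
```

The same wrapper can be applied to a whole `choiceL [...] "number literal"` parser, so that every branch is protected at once.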
I’m not sure what you meant to say with “alter FParsec’s CharStream to be able to make conditional parsing easier”, but the following sample demonstrates how you could write a low-level implementation that only uses the CharStream methods directly.
While this implementation parses hex and binary numbers without the help of system methods, it eventually delegates the parsing of decimal numbers to the Int32.TryParse and Double.TryParse methods.
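As a reduced illustration of the low-level style, the sketch below handles only the `#x` form by calling `CharStream` methods directly and converting the digits by hand. It is not the full implementation described above: it omits the binary and decimal branches and does no overflow checking, and the names `hexDigitValue` and `hexNumber` are illustrative.

```fsharp
#r "nuget: FParsec" // F# Interactive only; use a package reference in a compiled project
open FParsec

type Number =
    | Numeral of int
    | Decimal of float
    | Hexadecimal of int
    | Binary of int

// Value of one hex digit; assumes c has already been validated with isHex.
// '0'..'9' keep their low four bits; letters add 9 to theirs.
let hexDigitValue (c: char) = (int c &&& 15) + (int c >>> 6) * 9

// Low-level parser for "#x<hex digits>" that works on the CharStream directly.
let hexNumber : Parser<Number, unit> =
    fun stream ->
        let state = stream.State
        if stream.Skip("#x") then
            let digits = stream.ReadCharsOrNewlinesWhile(isHex, false)
            if digits.Length > 0 then
                let mutable n = 0
                for c in digits do
                    n <- n * 16 + hexDigitValue c // no overflow check
                Reply(Hexadecimal n)
            else
                // No digits after the prefix: report the error at the start.
                stream.BacktrackTo(state)
                Reply(Error, expected "hexadecimal digit")
        else
            Reply(Error, expected "number")
```

Writing the parser as a function from stream to `Reply` avoids the intermediate combinators, at the cost of managing backtracking and error messages by hand.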
As I said: it’s messy.