I am using the identifier parser from FParsec to parse the names of variables


Asked: May 29, 2026 (2026-05-29T11:09:22+00:00)


I am using the identifier parser from FParsec to parse the names of variables and functions, which are normally a mixture of Unicode and ASCII characters. Sometimes, however, the names contain escaped Unicode characters, either at the beginning (like \u03C0) or within the identifier (like swipe_board\u003A_b). I can still make them parseable using the isAsciiIdStart and isAsciiIdContinue options, but I can't define my own custom function for pre-processing the identifier before normalization. What could be a solution here?


1 Answer

Editorial Team · Answer 1 · 2026-05-29T11:09:23+00:00

The identifier parser first parses a string and then passes it to an IdentifierValidator instance for validation. Since the C# IdentifierValidator class is publicly accessible (though not documented), you could adapt the identifier parser to your needs by making the initial string-parsing step recognize the escapes as well.

Identifier parsing is somewhat complicated because it supports UTF-16 surrogate pairs, normalization and the Unicode XID character categories, which .NET does not support natively.
If you only need to support ASCII or UCS-2 identifiers specified in terms of the character categories supported by CharUnicodeInfo.GetUnicodeCategory, you could probably implement the parsing and validation in a single step using many1Satisfy2 or many1Chars2.
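
As a rough illustration of the single-step approach, here is a minimal sketch built on many1Chars2. It rests on several assumptions that are not part of the question: identifiers are limited to UCS-2, a start character is a letter or '_', a continuation character is a letter, digit or '_', an escape is written as \u followed by exactly four hex digits, and a decoded escape is accepted without further validation. The names isIdStart, isIdContinue, escapedChar, idChar and identifier are hypothetical helpers, not part of FParsec.

```fsharp
open System
open System.Globalization
open System.Text
open FParsec

// Assumed identifier rules (UCS-2 only); adjust to your language.
let isIdStart (c: char) = Char.IsLetter c || c = '_'
let isIdContinue (c: char) = Char.IsLetterOrDigit c || c = '_'

// Parses "\uXXXX" (exactly four hex digits) and returns the decoded char.
let escapedChar : Parser<char, unit> =
    pstring "\\u" >>. manyMinMaxSatisfy 4 4 isHex
    |>> fun hex -> char (Int32.Parse(hex, NumberStyles.HexNumber))

// A single identifier char: either a plain char accepted by the predicate,
// or an escape, which is decoded and accepted as-is (an assumption).
let idChar (isPlain: char -> bool) : Parser<char, unit> =
    satisfy isPlain <|> escapedChar

// The first char uses the start rule, the rest use the continue rule.
// Escapes are decoded before the result is normalized to form C.
let identifier : Parser<string, unit> =
    many1Chars2 (idChar isIdStart) (idChar isIdContinue)
    |>> fun (s: string) -> s.Normalize(NormalizationForm.FormC)

match run identifier "swipe_board\\u003A_b" with
| Success (name, _, _) -> printfn "%s" name   // prints "swipe_board:_b"
| Failure (msg, _, _) -> printfn "%s" msg
```

If decoded escapes should obey the same rules as plain characters, the result of escapedChar could be checked against the predicate inside idChar instead of being accepted unconditionally.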
