For an internationalised project, I have to validate the global syntax for a name (first, last) with Python. But the lack of unicode classes support is really maling things harder.
Is there any regex / library to do that ?
Examples:
Björn, Anne-Charlotte, توماس, 毛, or מיק must be accepted.
-Björn, Anne–Charlotte, Tom_ or entries like that should be rejected.
Is there any simple way to do that ?
Thanks.
Python does support unicode in regular expressions if you specify the re.UNICODE flag. You can probably use something like this:
Test code:
Result:
If you also want to disallow digits in names then change
[^\W_]to[^\W\d_]in both places.