Here’s my current code:
return str.matches("^[A-Za-z\\-'. ]+");
I want it to include international letters. How do I do that in Java?
Thanks.
It seems that what you want is to match all alphabetic characters. Typically you would do that using the POSIX \p{Alpha} expression, extended by the punctuation you also want to permit. As the Java regular expressions documentation says, it matches ASCII only. However, what the documentation does not say clearly is that you can make this class work with Unicode characters. To do that, you need to turn Unicode character class matching on.
You can do this in one of two ways:
1. Compile a Pattern object, passing the UNICODE_CHARACTER_CLASS constant: Pattern p = Pattern.compile("^[\\p{Alpha}\\-'. ]+", Pattern.UNICODE_CHARACTER_CLASS);
2. Use the (?U) embedded pattern flag: str.matches("^(?U)[\\p{Alpha}\\-'. ]+");

Proof of concept:
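A minimal sketch of such a check, comparing the default ASCII-only behaviour with both ways of turning Unicode matching on. The class name NameCheck, its helper methods, and the sample names are assumptions made for illustration, not the original test data. Non-ASCII letters are written as \u escapes so the result does not depend on the source file encoding.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only for this illustration.
class NameCheck {
    // Default behaviour: \p{Alpha} covers only the ASCII letters A-Z and a-z.
    static final Pattern ASCII_NAME =
            Pattern.compile("^[\\p{Alpha}\\-'. ]+");

    // Way 1: the same expression compiled with UNICODE_CHARACTER_CLASS.
    static final Pattern UNICODE_NAME =
            Pattern.compile("^[\\p{Alpha}\\-'. ]+", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean isAsciiName(String s) {
        return ASCII_NAME.matcher(s).matches();
    }

    static boolean isUnicodeName(String s) {
        return UNICODE_NAME.matcher(s).matches();
    }

    // Way 2: the (?U) embedded flag, usable directly with String.matches.
    static boolean isUnicodeNameEmbedded(String s) {
        return s.matches("^(?U)[\\p{Alpha}\\-'. ]+");
    }

    public static void main(String[] args) {
        String[] samples = {
            "Anne-Marie O'Neil",      // ASCII letters and permitted punctuation only
            "Pawe\u0142",             // contains U+0142, Latin small letter l with stroke
            "Jos\u00e9 M\u00fcller",  // contains U+00E9 and U+00FC
            "R2-D2"                   // contains digits, which no variant permits
        };
        for (String s : samples) {
            System.out.println(s
                    + " ascii=" + isAsciiName(s)
                    + " unicode=" + isUnicodeName(s)
                    + " embedded=" + isUnicodeNameEmbedded(s));
        }
    }
}
```

Compiling the Pattern once and reusing it, as in the first two helpers, avoids recompiling the expression on every call, which String.matches does internally.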
The obvious result is:
If you think that is all there is to it, I have two additional points to make: