I have an input field which is localized. I need to add validation, using a regex, so that it accepts only letters and numbers. I could have used [a-z0-9] if I were dealing with English only.
As of now, I am using the method Character.isLetterOrDigit(name.charAt(i)) (yes, I am iterating through each character) to check for the letters and digits of the various languages.
Are there any better ways of doing it? Any regex or other libraries available for this?
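Since Java 7 you can use Pattern.UNICODE_CHARACTER_CLASS.
Without the option, a pattern such as \w+ will not recognize the word “Müller”, but with Pattern.UNICODE_CHARACTER_CLASS it will, because the flag enables the Unicode version of the predefined character classes and POSIX character classes.
See here for more details.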
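A minimal sketch of the difference the flag makes, assuming the whole input is matched against `^\w+$`. The class and method names (`UnicodeWordDemo`, `matchesAscii`, `matchesUnicode`) are made up for illustration:

```java
import java.util.regex.Pattern;

class UnicodeWordDemo {
    // Without the flag, \w only covers [a-zA-Z_0-9].
    static final Pattern ASCII_WORD = Pattern.compile("^\\w+$");

    // With the flag, \w follows the Unicode definition of a word character.
    static final Pattern UNICODE_WORD =
            Pattern.compile("^\\w+$", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean matchesAscii(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    static boolean matchesUnicode(String s) {
        return UNICODE_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        String s = "M\u00FCller"; // "Müller", written with an escape to avoid source-encoding issues
        System.out.println("ASCII \\w:   " + matchesAscii(s));
        System.out.println("Unicode \\w: " + matchesUnicode(s));
    }
}
```

Note that `\w` still accepts the underscore, with or without the flag, so on its own it is slightly more permissive than "letters and digits only".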
You can also have a look here for more Unicode information in Java 7,
and here on regular-expressions.info for an overview of the Unicode scripts, properties and blocks.
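If the underscore must be rejected as well, Unicode properties can be used directly instead of `\w`. The following is only a sketch, assuming that "letters and numbers" means the general categories `\p{L}` (letters) and `\p{Nd}` (decimal digits), which roughly mirrors what Character.isLetterOrDigit checks. The class name `LetterOrDigitValidator` and the method `isValid` are hypothetical:

```java
import java.util.regex.Pattern;

class LetterOrDigitValidator {
    // One or more characters, each a Unicode letter (\p{L}) or a decimal digit (\p{Nd}).
    static final Pattern LETTERS_AND_DIGITS = Pattern.compile("[\\p{L}\\p{Nd}]+");

    // matches() requires the entire input to match, so no ^ and $ anchors are needed.
    static boolean isValid(String name) {
        return name != null && LETTERS_AND_DIGITS.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("M\u00FCller42"));        // Latin letters with an umlaut, plus digits
        System.out.println(isValid("\u65E5\u672C\u8A9E"));   // CJK ideographs
        System.out.println(isValid("a_b"));                  // underscore is neither a letter nor a digit
    }
}
```

Combining marks are not covered by this character class, so a name typed in decomposed form (a base letter followed by a separate accent) would be rejected; adding `\p{M}` to the class, or normalizing the input first, are possible ways around that.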
See here for a famous answer from tchrist about the caveats of regex in Java, including an update on what has changed with Java 7 (or will be in Java 8).