I have a regex that blocks invalid characters in a string, but it's also blocking Chinese characters, and I don't want that. Please help me with it. Below is the regex string that I am using.
String re = "[^\\x09\\x0A\\x0D\\x20-\\xD7FF\\xE000-\\xFFFD\\x10000-x10FFFF]";
Thanks in anticipation!
Since Java 7 you can make use of Unicode properties/scripts.
E.g. you can use the property \p{L} to match a letter in any language, or the script \p{IsHiragana} to match a character contained in Hiragana. You need to check which script fits your needs; for Chinese characters that would be \p{IsHan}. See the documentation on docs.oracle.com for more details about regex and Unicode.
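As a rough illustration of the properties above, here is a minimal sketch. The class and method names are made up for this example, and the test characters are written as Unicode escapes so the source file encoding does not matter.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
public class UnicodePropertyDemo {
    // \p{L} matches a single letter from any script.
    static final Pattern LETTER = Pattern.compile("\\p{L}");
    // \p{IsHiragana} and \p{IsHan} match by script (Java 7 and later).
    static final Pattern HIRAGANA = Pattern.compile("\\p{IsHiragana}");
    static final Pattern HAN = Pattern.compile("\\p{IsHan}");

    // Each helper checks whether the whole input is exactly one matching character.
    static boolean isLetter(String s) {
        return LETTER.matcher(s).matches();
    }

    static boolean isHiragana(String s) {
        return HIRAGANA.matcher(s).matches();
    }

    static boolean isHan(String s) {
        return HAN.matcher(s).matches();
    }

    public static void main(String[] args) {
        String han = "\u4E2D";      // a Chinese (Han) character, U+4E2D
        String hiragana = "\u3042"; // a Hiragana character, U+3042

        System.out.println(isLetter(han));
        System.out.println(isHan(han));
        System.out.println(isHiragana(hiragana));
        System.out.println(isHan("a"));
    }
}
```

Both the Han and the Hiragana character count as letters under \p{L}, while the script properties tell them apart.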
It is also possible to match the opposite, e.g. \P{L} matches every character that is NOT a letter. Alternatively, just add \p{L} to your negated character class instead of the ranges that are meant to define letters.
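Both variants could look roughly like the sketch below. The set of allowed characters in the negated class (tab, line feed, carriage return, space, letters, digits, punctuation) is only an assumption for illustration, and the class and method names are hypothetical; adjust the class to whatever you actually want to permit.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the allowed character set is an assumption.
public class NegatedClassDemo {
    // \P{L} (capital P) matches any single character that is NOT a letter.
    static final Pattern NON_LETTER = Pattern.compile("\\P{L}");

    // Negated class: matches anything that is not a tab, LF, CR, space,
    // letter (\p{L}), number (\p{N}) or punctuation (\p{P}).
    static final Pattern INVALID =
            Pattern.compile("[^\\x09\\x0A\\x0D\\x20\\p{L}\\p{N}\\p{P}]");

    // Removes every character that is not a letter.
    static String stripNonLetters(String s) {
        return NON_LETTER.matcher(s).replaceAll("");
    }

    // True if the input contains at least one character outside the allowed set.
    static boolean hasInvalid(String s) {
        return INVALID.matcher(s).find();
    }

    // Removes every character outside the allowed set.
    static String stripInvalid(String s) {
        return INVALID.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String chinese = "Hi, \u4E2D\u6587!"; // Latin and Chinese letters with punctuation
        System.out.println(hasInvalid(chinese));
        System.out.println(hasInvalid("abc\u0000"));
        System.out.println(stripNonLetters("a1\u4E2D!"));
    }
}
```

With this class, Chinese letters pass through because they are covered by \p{L}, while control characters such as U+0000 are still reported as invalid.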