I want to remove all non-alphabetic characters from a string. The problem is that I don't know the letter range in advance, because it is a UTF-8 string.
It can be ENGLISH, ՀԱՅԵՐԵՆ, ქართული, УКРАЇНСЬКИЙ, РУССКИЙ
I usually do something like this:
$str = preg_replace('/[^a-zA-Z]/', '', $str);
or
$str = preg_replace('/[^\w]/u', '', $str);
but they both clear foreign characters.
Any ideas?
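UPDATE: With Unicode support, the regular expression looks like this (used with the u modifier, as in the second snippet above):
[^\p{L}\s]+ (keeps spaces)
It matches every character that is neither a letter nor whitespace, so replacing the matches with an empty string removes all non-letter characters from a UTF-8 string.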
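For illustration, here is a minimal sketch of that pattern in PHP. The helper name keepLettersAndSpaces is hypothetical (it does not come from the original post), and returning an empty string when preg_replace fails is an assumption made for brevity, not a recommendation.

```php
<?php
// Hypothetical helper, named for illustration only.
function keepLettersAndSpaces(string $str): string
{
    // \p{L} matches any Unicode letter and \s matches whitespace.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    $result = preg_replace('/[^\p{L}\s]+/u', '', $str);

    // preg_replace returns null on error (for example, invalid UTF-8 input);
    // falling back to '' here is an assumption made to keep the sketch short.
    return $result ?? '';
}

echo keepLettersAndSpaces('ENGLISH, ՀԱՅԵՐԵՆ, ქართული!'), "\n";
// prints "ENGLISH ՀԱՅԵՐԵՆ ქართული"
```

Without the u modifier the subject is processed byte by byte, so multi-byte letters would most likely be damaged rather than preserved.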
\P{L}+ - matches any non-letter symbols
\p{P}+ - removes punctuation only
Here are some reference docs that can be helpful: