I am trying to adapt a php application to handle non-latin scripts (specifically: Japanese,

Question


Asked: May 23, 2026


I am trying to adapt a PHP application to handle non-Latin scripts (specifically Japanese, Simplified Chinese and Arabic). The app’s data validation routines make frequent use of regular expressions to check input, but I am not sure how to adapt the \w character type to other languages without installing additional locales on the system (which I cannot rely on).

Previous developers who worked on the app simply added the needed characters to the regexes as the number of supported languages grew (you frequently see “[\wÀÁÂÃÄÅÆÇÈÉ… etc” in the code), but I can’t really do this for all the alphabets I need to support now.

Does anybody out there have some advice on how to tackle this?


1 Answer

Editorial Team · Answer 1 · 2026-05-23T10:22:33+00:00

See this comment on php.net: http://www.php.net/manual/en/regexp.reference.unicode.php#102756

For example:

// $string may contain only Arabic-script characters
preg_match('@^\p{Arabic}+$@u', $string);

// $string may contain only Cyrillic-script characters
preg_match('@^\p{Cyrillic}+$@u', $string);

// $string may contain word characters and Greek-script characters
preg_match('@^[\p{Greek}\w]+$@u', $string);

…and so on
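Applied to the scripts named in the question, the same idea could look roughly like the sketch below. The function names isUnicodeWord and isInAllowedScripts are hypothetical, not part of PHP or of the linked comment. The sketch assumes the input is UTF-8 and that PCRE was built with Unicode property support. It anchors with \A and \z instead of ^ and $ so that a trailing newline is not silently accepted.

```php
<?php
// Hypothetical helper: a script-independent stand-in for \w.
// \p{L} = letters, \p{M} = combining marks, \p{N} = numbers, plus underscore.
function isUnicodeWord(string $input): bool
{
    // The u modifier makes PCRE treat the pattern and the subject as UTF-8.
    // preg_match returns false on invalid UTF-8, so "=== 1" rejects that too.
    return preg_match('/\A[\p{L}\p{M}\p{N}_]+\z/u', $input) === 1;
}

// Hypothetical helper: accept only the scripts mentioned in the question.
// Japanese text normally mixes Han (kanji), Hiragana and Katakana.
function isInAllowedScripts(string $input): bool
{
    return preg_match('/\A[\p{Han}\p{Hiragana}\p{Katakana}\p{Arabic}]+\z/u', $input) === 1;
}

var_dump(isUnicodeWord('日本語'));      // bool(true)
var_dump(isUnicodeWord('foo bar'));     // bool(false): the space is not a word character
var_dump(isInAllowedScripts('مرحبا'));  // bool(true)
var_dump(isInAllowedScripts('hello'));  // bool(false): Latin is not in the allowed list
```

One caveat worth testing against real input: some punctuation and combining marks are assigned to the Common or Inherited script rather than to a specific one, so a strict script-only class may reject text that users consider valid. Adding \p{M} or specific punctuation to the class is one possible way to loosen it.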

demonstration: http://cecb.freephptest.com/
