I’m trying to do a search for just letters and spaces (simple words) in


Asked: June 17, 2026


I’m trying to do a search for just letters and spaces (simple words) in other languages, and if I find numbers or punctuation, throw a detection exception. When I test the regex I’ve written against UTF-8 numeric characters I found on Wikipedia, it never flags them as bad characters, and I’m baffled as to why, unless it treats all of these numbers as letters.

Here’s the characters I’ve tried:

5 or 伍
http://en.wikipedia.org/wiki/Chinese_numerals

5 or Є
http://en.wikipedia.org/wiki/Cyrillic_script

Here’s the code:

$were_bad_characters_found = preg_match('/[^\p{L}\p{Zs}]+/us',  $data);

The answer to the question it asks is always no: there were no bad characters found.

Based on the docs, it seemed that this would work, and it does in fact work when I run simple English numbers through it, but as soon as multilingual characters hit, it just rolls over on me. I have a number of variations on this for detecting different common scenarios, and all the UTF-8 regex code only seems to work well for English characters. Thoughts?


1 Answer

Editorial Team · Answer 1 · 2026-06-17T04:51:29+00:00


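The characters you showed are letters as far as Unicode is concerned, so \p{L} matches them.

U+4F0D 伍 is not a digit: its general category is Lo (Other Letter), and it has non-numeric meanings as well as the numeric one.
U+0404 Є is not a digit either: it is a Cyrillic capital letter (category Lu) with no numeric value assigned in Unicode.

The ASCII digits 0-9 have the general category Nd (Decimal Number), not L, which is why your regex flags them. In PHP you can use \p{Nd} to match decimal digits, or \p{N} to match any character in a number category. Your regex is working fine.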
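
To see the difference for yourself, here is a minimal sketch that wraps the pattern from the question in a helper. The function name `has_bad_characters` is made up for illustration, and the sample inputs other than 伍 and Є are my own picks. The strings use `\u{...}` escapes (PHP 7+) so the result does not depend on the encoding of the source file.

```php
<?php
// Hypothetical helper: returns true if $data contains anything other than
// Unicode letters (\p{L}) and space separators (\p{Zs}).
function has_bad_characters(string $data): bool
{
    // With the /u flag, preg_match() returns false when $data is not valid UTF-8.
    $result = preg_match('/[^\p{L}\p{Zs}]/u', $data);
    if ($result === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $result === 1;
}

var_dump(has_bad_characters("\u{4F0D}")); // 伍 U+4F0D, category Lo (a letter)
var_dump(has_bad_characters("\u{0404}")); // Є U+0404, category Lu (a letter)
var_dump(has_bad_characters("abc 5"));    // ASCII digit 5, category Nd
var_dump(has_bad_characters("\u{0665}")); // Arabic-Indic digit five, category Nd
var_dump(has_bad_characters("\u{2164}")); // Roman numeral five, category Nl
```

The first two calls report no bad characters because those code points are letters; the last three are flagged because they belong to number categories rather than to L.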
