in an original code (Drupal core module) previous developer commented out the string: if

Asked: May 28, 2026 (2026-05-28T15:15:30+00:00)

In the original code (a Drupal core module), a previous developer commented out this line:

if (preg_match('/[^\x{80}-\x{F7} a-z0-9@_.\'-]/i', $name)) {

and instead, added:

if (preg_match('/[^\x{80}-\x{F7} a-z0-9@_.\'-]/iu', $name)) {

Can you help me understand the difference between these two? What does the u modifier do? In the PHP docs I found:

u (PCRE_UTF8)
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.

So I guess the previous developer had problems with interpreting special characters or something. I’m a bit puzzled; please advise.

1 Answer

Editorial Team · Answer 1 · 2026-05-28T15:15:31+00:00

The u modifier is needed to process UTF-8 encoded input properly. A pattern like \xC1 is meant to match the Unicode character U+00C1 (Á). Encoded in UTF-8, Á becomes the two bytes \xC3\x81, so without the modifier the single byte \xC1 doesn’t match. The u modifier makes PCRE treat the pattern and the subject as UTF-8, so \xC1 is read as the code point U+00C1 and does match.

Basically, when you work with utf-8 encoded text this is what will happen:

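<?php
// Assumes this file is saved as UTF-8, so 'Á' is the two bytes \xC3\x81.

// With u, \xC1 means the code point U+00C1, which is Á.
var_dump(preg_match('/\xC1/u', 'Á'));
// => int(1), matches

// Without u, \xC1 means the single byte 0xC1, which does not occur in \xC3\x81.
var_dump(preg_match('/\xC1/', 'Á'));
// => int(0), doesn't match
?>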

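In your case, without u the character class [^\x{80}-\x{F7} a-z0-9@_.\'-] is applied byte by byte. Every byte of a non-ASCII UTF-8 character (lead bytes \xC2-\xF4, continuation bytes \x80-\xBF) falls inside \x80-\xF7, so the negated class never matches non-ASCII UTF-8 text and the check ignores it entirely. With u, the class is applied to code points, so the expression matches any Unicode character outside the range U+0080 – U+00F7, which covers all of Cyrillic, Greek, Arabic, Hebrew, … If the if branch rejects the name (the snippet does not show its body), the first version lets all of those scripts through and the second one flags them.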
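To make the difference concrete, here is a minimal sketch that runs both patterns from the question against one sample name. The name "Иван" is a hypothetical input chosen for illustration, and it is written as explicit UTF-8 bytes so the result does not depend on how the source file is saved.

```php
<?php
// Hypothetical sample input: the Cyrillic name "Иван" as raw UTF-8 bytes.
$name = "\xD0\x98\xD0\xB2\xD0\xB0\xD0\xBD";

// Without u the subject is scanned byte by byte. Every byte above lies in
// 0x80-0xF7, so the negated class finds nothing to match.
$withoutU = preg_match('/[^\x{80}-\x{F7} a-z0-9@_.\'-]/i', $name);

// With u the subject is decoded into code points. U+0418 (И) lies outside
// U+0080-U+00F7, so the negated class matches it.
$withU = preg_match('/[^\x{80}-\x{F7} a-z0-9@_.\'-]/iu', $name);

var_dump($withoutU); // int(0)
var_dump($withU);    // int(1)
```

So the same name passes the check without u and trips it with u, which may be exactly the behaviour change the previous developer was after.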
