This is what I’ve found in the Kohana3 validator rules:
public static function digit($str, $utf8 = FALSE)
{
    if ($utf8 === TRUE)
    {
        return (bool) preg_match('/^\pN++$/uD', $str);
    }
    else
    {
        return (is_int($str) AND $str >= 0) OR ctype_digit($str);
    }
}
Can someone give an example where passing the $utf8 parameter as true versus false gives different results (to be precise: false positives for $utf8 == false)?
From what I remember, digits are ASCII-safe characters, and no UTF-8 character can be confused with them.
PS: to be even more specific, is it possible to fool this check and pass something that in UTF-8 would not look like a number, but would pass the check with $utf8 == false?
Just gave the second part of your question a bit more alcohol, and my conclusion is that you can't hide an ASCII digit in a UTF-8 sequence. Digits must be 0x30..0x39, or in the bit range 00110000 .. 00110110 .. 00111001. UTF-8 multi-byte encodings use prefixes such as 110xxxxx for a lead byte and 10xxxxxx for a continuation byte.
And therefore a digit's ASCII representation can't match anywhere: every byte of a multi-byte sequence starts with a 1 bit, while a digit byte starts with 0011.
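The byte-level argument can be checked empirically. This is only a rough sketch, assuming PHP 7 or later (for the "\u{...}" escapes and type hints); the helper name multibyte_contains_digit_byte is made up for illustration and is not part of Kohana or PHP.

```php
<?php
// Hypothetical helper: returns true if any byte belonging to a multi-byte
// UTF-8 character in $s falls in the ASCII digit range 0x30..0x39.
function multibyte_contains_digit_byte(string $s): bool
{
    // Split into characters; the /u flag makes PCRE reject invalid UTF-8.
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    foreach ($chars as $ch) {
        if (strlen($ch) === 1) {
            continue; // single-byte (ASCII) character, not what we are testing
        }
        foreach (str_split($ch) as $byte) {
            $b = ord($byte);
            if ($b >= 0x30 && $b <= 0x39) {
                return true;
            }
        }
    }

    return false;
}

// Arabic-Indic three (D9 A3), fullwidth three (EF BC 93), one half (C2 BD).
var_dump(multibyte_contains_digit_byte("\u{0663}\u{FF13}\u{00BD}")); // bool(false)
```

Every byte of those sequences is 0x80 or above, so none of them can be mistaken for a digit byte.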
So it's impossible for a string to match in Latin-1/ASCII mode while reading as anything other than plain ASCII digits in UTF-8; whatever passes there also satisfies \pN in /u mode. Ignoring invalid encodings of course.
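As for the first part of the question, the two modes can disagree, but only in the other direction: \pN also matches non-ASCII numeric characters, which ctype_digit rejects. A small sketch, assuming PHP 7 or later, PCRE built with Unicode property support, and the default C locale; digit_ascii and digit_utf8 are hypothetical standalone copies of the two branches, not Kohana's own API.

```php
<?php
// Standalone copies of the two branches quoted in the question, for
// illustration only; these function names are hypothetical.
function digit_ascii($str)
{
    return (is_int($str) && $str >= 0) || ctype_digit($str);
}

function digit_utf8($str)
{
    return (bool) preg_match('/^\pN++$/uD', $str);
}

$samples = [
    'ASCII digits'         => '123',
    'Arabic-Indic three'   => "\u{0663}",
    'fullwidth three'      => "\u{FF13}",
    'vulgar fraction half' => "\u{00BD}",
    'digits plus letter'   => '12a',
];

foreach ($samples as $label => $s) {
    printf(
        "%-22s ascii=%s utf8=%s\n",
        $label,
        var_export(digit_ascii($s), true),
        var_export(digit_utf8($s), true)
    );
}
```

Under those assumptions the non-ASCII samples should pass only the $utf8 = TRUE branch, '123' should pass both, and '12a' should pass neither, so $utf8 = FALSE is the stricter of the two.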