I want to get the strlen() of the same string in Shift-JIS and in UTF-8, then compare the two.
A string could be mixed, e.g. "ああ12345678sdfdszzz". I tried strlen, but it gives different results depending on the encoding. mb_strlen also doesn't seem to help because this is a mixed string.
For example:
ああ12345678 >> strlen() = 24 chars
ああああああああああああああああ >> strlen() = 48 chars
ああああああああああああああああああ >> strlen() = 54 chars
There seems to be no rule. So what is the best way to calculate string length and compare it across encodings for multilingual text?
strlen only counts bytes, so it is only useful for single-byte character encodings; use mb_strlen for multi-byte character encodings, since it counts the actual characters instead.
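A minimal sketch of the difference, assuming the mbstring extension is loaded and the source file itself is saved as UTF-8. The helper name char_length is hypothetical and only wraps mb_strlen with an explicit encoding. Passing the encoding explicitly is what makes mixed ASCII/Japanese strings count correctly:

```php
<?php
// Hypothetical helper: count characters, not bytes, in a given encoding.
function char_length(string $s, string $encoding): int
{
    return mb_strlen($s, $encoding);
}

// Assumption: this file is saved as UTF-8, so the literal is UTF-8 bytes.
$utf8 = 'ああ12345678';
// Re-encode the same text as Shift-JIS for comparison.
$sjis = mb_convert_encoding($utf8, 'SJIS', 'UTF-8');

echo strlen($utf8), "\n";               // 14 bytes: 2 kana * 3 bytes + 8 ASCII
echo strlen($sjis), "\n";               // 12 bytes: 2 kana * 2 bytes + 8 ASCII
echo char_length($utf8, 'UTF-8'), "\n"; // 10 characters
echo char_length($sjis, 'SJIS'), "\n";  // 10 characters
```

The byte counts differ between the two encodings, but the character counts match, so the character count is the value to compare.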