Sorry for the title, I really didn’t know how to say this…
I often have a string that needs to be cut after X characters. My problem is that this string often contains special characters like &egrave;
So I'm wondering: is there a way in PHP, without transforming my string, to know whether I am cutting it in the middle of a special char?
Example
This is my string with a special char : &egrave; - and I want it to cut in the middle of the "&egrave;" but still keeping the string intact
So right now my result with a substring would be:
This is my string with a special char : &egra
but I want to have something like this :
This is my string with a special char : &egrave;
The best thing to do here is to store your string as UTF-8 without any HTML entities, and use the mb_* family of functions with utf8 as the encoding.
But if your string is ASCII or iso-8859-1/win1252, you can use the special HTML-ENTITIES encoding of the mbstring library, which counts each entity as a single character.
However, if your underlying string is UTF-8 or some other multibyte encoding, using HTML-ENTITIES is not safe! This is because HTML-ENTITIES really means "win1252 with high-bit characters as html entities". Each byte of a multibyte character is then treated as a separate character, so a cut can land in the middle of one.
When your string is in a multibyte encoding, you must instead convert all HTML entities to a common encoding before you split.
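A minimal sketch of that last approach, assuming the mbstring extension is loaded and the input is UTF-8 text with HTML 4.01 named entities. The helper name cut_entity_string is made up for illustration and is not from the original answer. It also swaps in html_entity_decode/htmlentities for the conversion step rather than the HTML-ENTITIES mbstring encoding, which newer PHP versions deprecate.

```php
<?php
// Hypothetical helper: cut a string that may contain HTML entities after
// $length characters, without splitting an entity or a multibyte character.
function cut_entity_string(string $text, int $length): string
{
    // Turn entities such as &egrave; into real UTF-8 characters.
    $decoded = html_entity_decode($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');

    // Cut by characters, not bytes.
    $cut = mb_substr($decoded, 0, $length, 'UTF-8');

    // Re-encode so the result uses entities again, like the input.
    // Assumption: the input was fully entity-encoded to begin with;
    // otherwise this step also encodes characters that were literal.
    return htmlentities($cut, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

$str = 'This is my string with a special char : &egrave; - and I want it to cut';

// The entity counts as one character, so position 41 is the whole "&egrave;".
echo cut_entity_string($str, 41), "\n";
// prints: This is my string with a special char : &egrave;
```

If you do not need entities in the output, you can drop the htmlentities call and keep working with the decoded UTF-8 string.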