Our content people have been writing in Word and pasting text into the old Unicode system. I’m now trying to move to UTF-8.
However, upon importing the data there are characters I cannot get rid of.
I have tried the functions from the following snippet and Stack Overflow thread, and none of them fixes this string: http://snipplr.com/view.php?codeview&id=11171 / How to replace Microsoft-encoded quotes in PHP
String: Danâ??s back for more!!
In this kind of situation, I generally start with the string I have copy-pasted from Word:
And, going through it byte by byte, I output the hexadecimal code of each byte:
Which gives an output such as this one:
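The steps above can be sketched roughly as follows. The test string is an assumption: it is rebuilt from the byte sequences discussed in this answer (â followed by a typographic quote), and `hexDumpBytes` is a hypothetical helper name, not something from the original post.

```php
<?php
// Assumed reconstruction of the pasted string: "Dan", then â (0xc3 0xa2),
// then a right single quotation mark (0xe2 0x80 0x99), then the rest.
$str = "Dan\xc3\xa2\xe2\x80\x99s back for more!!";

// Hypothetical helper: format every byte of $s as "0xNN", space-separated.
// strlen() and $s[$i] work on bytes, not characters, which is what we want.
function hexDumpBytes(string $s): string
{
    $parts = [];
    for ($i = 0; $i < strlen($s); $i++) {
        $parts[] = sprintf('0x%02x', ord($s[$i]));
    }
    return implode(' ', $parts);
}

// Dump only the first nine bytes: "Dan", the two special characters, and "s".
echo hexDumpBytes(substr($str, 0, 9)), "\n";
// prints: 0x44 0x61 0x6e 0xc3 0xa2 0xe2 0x80 0x99 0x73
```

Printing only the hex codes, rather than the raw bytes next to them, keeps the output readable even when the terminal cannot display a lone byte of a multi-byte sequence.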
Then, with a bit of guessing, luck, and trial and error, you’ll find out that:
â is a character that fits on two bytes: 0xc3 0xa2
’ is a character that fits on three bytes: 0xe2 0x80 0x99
Hint: it’s easier when you don’t have two special characters following each other 😉
After that, it’s only a matter of using str_replace to replace the correct sequence of bytes with another character; for example, to replace the special quote with a normal one:
Will give you:
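A minimal sketch of that replacement, assuming the same reconstructed test string as before (â followed by a typographic quote). Only the quote is replaced here; the stray â is left in place.

```php
<?php
// Assumed test string: "Dan", â (0xc3 0xa2), a right single quotation
// mark (0xe2 0x80 0x99), then the rest of the sentence.
$str = "Dan\xc3\xa2\xe2\x80\x99s back for more!!";

// Replace the three-byte UTF-8 right single quotation mark (U+2019)
// with a plain ASCII apostrophe. Double quotes are needed so that PHP
// turns the \x escapes into raw bytes.
$fixed = str_replace("\xe2\x80\x99", "'", $str);

echo $fixed, "\n";
// prints: Danâ's back for more!!
```

str_replace also accepts arrays for the search and replacement arguments, so several byte sequences can be mapped in one call once you have identified them.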