I found a website that contains the string “donâ€™t”. The obvious intent was the word “don’t”. I looked at the source expecting to see some character references, but didn’t (it just shows the literal string “donâ€™t”). A Google search yielded nothing (except lots of other sites that have the same problem!). Can anyone explain what’s happening here?
Edit: Here’s the meta tag that was used:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Would this not cause the page to be served up as Latin-1 in the HTTP header?
In your browser, switch the page encoding to “UTF-8”. You’re seeing a right single quote character (U+2019), which is encoded by the octets 0xE2 0x80 0x99 in UTF-8. In your charset, windows-1252, those 3 octets render as “â€™”. The page should be explicitly specifying UTF-8 as its charset, either in the HTTP headers or in an HTML <meta> tag, but it probably isn’t.
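To see the effect outside a browser, here is a minimal Python sketch that reproduces the garbling and then reverses it. The variable names are only illustrative, and it assumes the text was damaged exactly once (UTF-8 bytes read as windows-1252), which may not hold for every affected page.

```python
# The intended text, using U+2019 RIGHT SINGLE QUOTATION MARK.
intended = "don\u2019t"

# The bytes the server actually sends when the file is saved as UTF-8.
utf8_bytes = intended.encode("utf-8")
assert utf8_bytes == b"don\xe2\x80\x99t"

# What a browser shows if it decodes those bytes as windows-1252:
# 0xE2 -> "â", 0x80 -> "€", 0x99 -> "™"
garbled = utf8_bytes.decode("windows-1252")
assert garbled == "don\u00e2\u20ac\u2122t"  # i.e. "donâ€™t"

# Reversing the mistake: re-encode as windows-1252, then decode as UTF-8.
# This only works if the text was mis-decoded once and no bytes were lost.
repaired = garbled.encode("windows-1252").decode("utf-8")
assert repaired == intended
```

This also bears on the iso-8859-1 meta tag in the question: browsers generally treat the iso-8859-1 label as windows-1252, so a page declared as Latin-1 but saved as UTF-8 ends up garbled in just this way.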