I pull some data from an HTML page with a list of products, and some of the text looks like this:
Organicâ„¢
In the HTML page, when I look at that same text, I can see it's supposed to read Organic with the TM (trademark) symbol after it. Why does it look like the above?
My main question is: how can I get rid of the TM, @ and copyright symbols so I am left with just a clean product name?
Thanks all for any help
Your page has the wrong character set declared (or no character set declared at all).
View the source HTML and see if the head section contains a tag like
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
If there’s no such tag, or the tag is there but the charset bit is missing, you haven’t declared a character set. If the tag is there and the charset bit is present, the declared character set is wrong. Looking at the specific example you gave, it looks like the text might be in UTF-8 but is being displayed as Latin-1.
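If you are scraping the page rather than viewing it, the same mismatch can be reproduced and reversed in code. The following is only a rough Python sketch, assuming the text really is UTF-8 that was decoded as Windows-1252 (the superset of Latin-1 that browsers typically use). The names SYMBOLS and clean_name are made up for illustration, and the set of symbols to strip (trademark, registered, copyright) is a guess at what the question wants removed.

```python
# The product name as it should appear: "Organic" followed by the trademark sign.
original = "Organic\u2122"

# Reproduce the problem: UTF-8 bytes decoded with the wrong charset.
# Assumption: the wrong charset is Windows-1252 (cp1252).
garbled = original.encode("utf-8").decode("cp1252")

# Reverse the mistake: recover the raw bytes, then decode them as UTF-8.
repaired = garbled.encode("cp1252").decode("utf-8")

# Hypothetical list of symbols to drop: trademark, registered, copyright.
SYMBOLS = "\u2122\u00ae\u00a9"


def clean_name(name: str) -> str:
    """Remove the symbols in SYMBOLS and trim surrounding whitespace."""
    return name.translate(str.maketrans("", "", SYMBOLS)).strip()


print(garbled)
print(repaired)
print(clean_name(repaired))
```

Stripping the symbols only gives a clean name once the text has been decoded correctly. If it is applied to the garbled string, the stray characters are left behind, so fixing the character set comes first.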