I am having a problem with UTF-8 characters not displaying correctly when viewed in Notepad++.
I am viewing a list of geographic locations downloaded from:
I have already set Encoding -> Encode in UTF-8.
An example of a display problem is the city “H̨alīmābād”. I see it as H, then a square character, then alīmābād. However, if I copy and paste from Notepad++ to, say, this text area, the city name shows up properly.
I’ve tried Googling around, but most of the answers say to set the encoding to UTF-8 in the editor, which, as I mentioned earlier, I have already done.
If anyone could suggest how to fix this issue I would very much appreciate it. Thanks much!
In your example, the first visible letter is encoded as the letter H followed by a combining ogonek: code point U+0048 followed by U+0328 (both hexadecimal). Your other accented letters are each encoded as a single code point, e.g. U+012B for the “latin small letter I with macron”.
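If you want to check this yourself, here is a small Python sketch that lists the code points in the name. It uses only the standard `unicodedata` module; the `describe` helper and the `city` variable are names I made up for illustration, and the string is written with escapes on the assumption that your file contains exactly the sequence described above.

```python
import unicodedata

# The city name from the question, written with escapes so the
# code points are unambiguous (assumed to match the file's contents).
city = "H\u0328al\u012bm\u0101b\u0101d"


def describe(text):
    """Return (code point, Unicode name, is_combining) for each character."""
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.name(ch, "<unnamed>"),
            unicodedata.combining(ch) != 0,  # non-zero class means a combining mark
        )
        for ch in text
    ]


for code_point, name, is_combining in describe(city):
    marker = " (combining mark)" if is_combining else ""
    print(f"{code_point}  {name}{marker}")

# NFC only merges a base letter and a mark when a precomposed
# character exists; compare the lengths before and after.
nfc = unicodedata.normalize("NFC", city)
print(len(city), len(nfc))
```

The two lengths come out the same, because Unicode has no precomposed “H with ogonek” for NFC to substitute. So normalizing the file would not remove the combining mark, and the display still depends on the font and layout engine.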
You might care to read the Unicode FAQ on Characters and Combining Marks. The question with the example of an “X with circumflex by use of X with a combining circumflex” is equivalent to your situation. You’ll note that it says “Your problem is most likely a limitation of the layout engine and/or font you are using”. As such, the first thing you might want to try is viewing the file with a different font.