I’m writing a small application in C that reads a simple text file and then outputs the lines one by one. The problem is that the text file contains special characters like Æ, Ø and Å, among others. When I run the program in the terminal, those characters are displayed as “?”.
Is there an easy fix?
First things first:
Ensure that your terminal can handle UTF-8 output. Having the correct locale set up and using the locale facilities can automate a lot of the file opening and conversion for you, depending on what you are doing.
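As a rough sketch of the locale point, assuming the file's encoding matches the locale's encoding (for example, a UTF-8 file under a UTF-8 locale): call setlocale once before any I/O, then read and write wide characters. The helper names use_environment_locale and print_lines_wide are made up for this illustration, and the 1024-character buffer is an arbitrary choice.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: adopt the locale named by the environment
   (LANG, LC_ALL, ...). Returns 1 on success, 0 if that locale is
   not available on this system. */
int use_environment_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Hypothetical helper: copy `in` to `out` as wide characters.
   Returns the number of chunks read (one per line when lines are
   shorter than the buffer), or -1 if reading or writing fails. */
long print_lines_wide(FILE *in, FILE *out)
{
    wchar_t line[1024];
    long count = 0;

    /* fgetws converts bytes to wchar_t using the current locale. */
    while (fgetws(line, (int)(sizeof line / sizeof line[0]), in) != NULL) {
        if (fputws(line, out) == EOF)
            return -1;
        count++;
    }
    return ferror(in) ? -1 : count;
}
```

In main you would call use_environment_locale() first, open the file with fopen, and pass it to print_lines_wide(fp, stdout). Avoid mixing printf-style byte output and wide output on the same stream, since a stream's orientation is fixed by the first operation on it.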
Remember that the width of a code point in UTF-8 is variable (one to four bytes). This means you can’t just seek to an arbitrary byte and begin reading as you can with ASCII, because you might land in the middle of a code point. Good libraries can handle this for you in some cases.
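If you do have to start from an arbitrary byte offset, one way to recover is to skip forward to the next code point boundary. This works because every continuation byte in UTF-8 has the bit pattern 10xxxxxx, while the first byte of a code point never does. The sketch below assumes the buffer holds valid UTF-8; the function names are invented for this example.

```c
#include <stddef.h>

/* Returns 1 if `byte` is a UTF-8 continuation byte (10xxxxxx). */
int utf8_is_continuation(unsigned char byte)
{
    return (byte & 0xC0) == 0x80;
}

/* Hypothetical helper: starting at `pos`, step past any continuation
   bytes and return the index of the first byte of the next code
   point. Returns `len` if no boundary is found before the end. */
size_t utf8_next_boundary(const unsigned char *buf, size_t len, size_t pos)
{
    while (pos < len && utf8_is_continuation(buf[pos]))
        pos++;
    return pos;
}
```

For example, Æ is encoded as the two bytes 0xC3 0x86. Landing on the 0x86 and calling utf8_next_boundary moves you to the byte after it, where the next code point starts.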
Here is some code (not mine) that demonstrates some usage of UTF-8 file reading and wide character handling in C.