Currently I am trying to read a UTF-16 encoded CSV file char by char, and convert each char into ascii so I can process it. I later plan to change my processed data back to UTF-16 but that is besides the point right now.
I know right off the bat I am doing this completely wrong, as I have never attempted anything like this before:
int main(void)
{
FILE *fp;
int ch;
if(!(fp = fopen("x.csv", "r"))) return 1;
while(ch != EOF)
{
ch = fgetc(fp);
ch = (wchar_t) ch;
ch = (char) ch;
printf("%c", ch);
}
fclose(fp);
return 0;
}
Wishfully thinking, I was hoping that that work by magic for some reason but that was not the case. How can I read a UTF-16 CSV file and convert it to ascii? My guess is since each utf-16 char is two bytes (i think?) I’m going to have to read two bytes at a time from the file into a variable of some datatype which I am not sure of. Then I guess I will have to check the bits of this variable to make sure it is valid ascii and convert it from there? I don’t know how I would do this though and any help would be great.
You should use
fgetwc. The below code should work in the presence of a byte-order mark, and an available locale nameden_US.UTF-16.