I’m trying to load into string the content of file saved on the dics. The file is .CS code, created in VisualStudio so I suppose it’s saved in UTF-8 coding. I’m doing this:
FILE *fConnect = _wfopen(connectFilePath, _T("r,ccs=UTF-8"));
if (!fConnect)
return;
fseek(fConnect, 0, SEEK_END);
lSize = ftell(fConnect);
rewind(fConnect);
LPTSTR lpContent = (LPTSTR)malloc(sizeof(TCHAR) * lSize + 1);
fread(lpContent, sizeof(TCHAR), lSize, fConnect);
But result is so strange – the first part (half of the string is content of .CS file), then strange symbols like 췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍췍 appear.
So I think I read the content in a wrong way. But how to do that properly?
Thank you so much and I’m looking to hear!
ftell(), fseek(), and fread() all operate on bytes, not on characters. In a Unicode environment, TCHAR is at least 2 bytes, so you are allocating and reading twice as much memory as you should be.
I have never seen fopen() or _wfopen() support a “ccs” attribute. You should use “rb” as the reading mode, read the raw bytes into memory, and then decode them once you have them all available, ie: