When I get a file's size using stat(), the result differs depending on the file's contents. Why does it behave like this?
When “huffman.txt” contains a simple string like “Hi how are you”, it gives file size = 14. But when “huffman.txt” contains a string like “άSUä5Ñ®qøá”F”, it gives file size = 30.
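#include <sys/stat.h>
#include <stdio.h>

int main(void)
{
    FILE* original_fileptr = fopen("huffman.txt", "rb");
    if (original_fileptr == NULL) {
        printf("ERROR: fopen fail in %s at %d\n", __func__, __LINE__);
        return 1;
    }
    /* create variable of stat and fill it in; stat() returns non-zero on failure */
    struct stat stp = { 0 };
    if (stat("huffman.txt", &stp) != 0) {
        printf("ERROR: stat fail in %s at %d\n", __func__, __LINE__);
        fclose(original_fileptr);
        return 1;
    }
    /* determine the size of the data in the file; st_size is an off_t byte count */
    long long filesize = (long long)stp.st_size;
    printf("\nFile size is %lld\n", filesize);
    fclose(original_fileptr);
    return 0;
}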
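This has to do with encoding. stat() reports the file size in bytes, not the number of characters.
Plain English text is encoded in ASCII, where each character is one byte.
Characters outside ASCII take more than one byte each, and how many depends on the encoding the file was saved in: in UTF-8 a character occupies one to four bytes, and in UTF-16 two or four.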
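To make the byte counts concrete, here is a minimal sketch that assumes the file was saved as UTF-8 (the question does not say which encoding the editor used). The helper name utf8_encoded_len is made up for illustration; it returns how many bytes UTF-8 needs for a single Unicode code point.

```c
#include <stddef.h>

/* Hypothetical helper (not from the question): number of bytes UTF-8
 * uses to encode one Unicode code point, or 0 if it cannot be encoded. */
size_t utf8_encoded_len(unsigned long cp)
{
    if (cp < 0x80)
        return 1;                       /* ASCII range: one byte */
    if (cp < 0x800)
        return 2;                       /* e.g. U+00E4 (a with diaeresis) */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not valid code points */
    if (cp < 0x10000)
        return 3;                       /* e.g. U+20AC (euro sign) */
    if (cp <= 0x10FFFF)
        return 4;                       /* everything above the BMP */
    return 0;                           /* beyond the Unicode range */
}
```

Under that assumption, every letter of “Hi how are you” costs one byte, which matches the reported size of 14, while a character such as ä (U+00E4) costs two.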
Easiest way to see what is happening is to print each character using
You’ll understand why the file size is different.
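One possible way to look at the individual bytes is sketched below. It is not necessarily what the answer had in mind: dump_utf8 is a hypothetical helper, and it assumes the buffer holds valid UTF-8. It prints each byte in hex and counts characters by skipping continuation bytes (those of the form 10xxxxxx).

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: print every byte of buf in hex and return the
 * number of UTF-8 characters (code points). Assumes valid UTF-8 input. */
size_t dump_utf8(const unsigned char *buf, size_t len)
{
    size_t chars = 0;
    for (size_t i = 0; i < len; i++) {
        /* Every byte that is not a continuation byte starts a new character. */
        if ((buf[i] & 0xC0) != 0x80)
            chars++;
        printf("%02X ", (unsigned)buf[i]);
    }
    printf("\n%zu bytes, %zu characters\n", len, chars);
    return chars;
}
```

Reading “huffman.txt” into a buffer with fread() and passing it to dump_utf8 would show the byte count that stat() reports next to the smaller character count.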