I'm trying to write a function that compares the contents of two files.
I want it to return 1 if the files are the same, and 0 if they are different.
ch1 and ch2 work as buffers, and I used fgets to read the contents of my files.
I think there is something wrong with the EOF handling, but I'm not sure. The files are given on the command line.
P.S. It works with small files under 64 KB, but it doesn't work with larger files (700 MB movies, for example, or 5 MB .mp3 files).
Any ideas on how to fix it?
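#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if both streams have identical content, 0 otherwise.
   Both streams should be opened in binary mode ("rb"). */
int compareFile(FILE* file_compared, FILE* file_checked)
{
    bool diff = 0;
    int N = 65536;                          /* chunk size in bytes */
    char* b1 = (char*) calloc (1, N+1);
    char* b2 = (char*) calloc (1, N+1);
    size_t s1, s2;

    /* Assumption: an allocation failure is reported as "different". */
    if (b1 == NULL || b2 == NULL) {
        free(b1);
        free(b2);
        return 0;
    }
    do {
        s1 = fread(b1, 1, N, file_compared);
        s2 = fread(b2, 1, N, file_checked);
        /* Different chunk lengths or different bytes: the files differ. */
        if (s1 != s2 || memcmp(b1, b2, s1)) {
            diff = 1;
            break;
        }
    } while (s1 > 0);   /* a zero-length read means EOF (or a read error) */
    free(b1);
    free(b2);
    if (diff) return 0;
    else return 1;
}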
EDIT: I've improved this function based on your answers. However, it now compares only the first buffer, with one exception: I found that it stops reading the file when it reaches a 0x1A character (attached file). How can I make it work?
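One likely cause of the 0x1A symptom, assuming the program runs on Windows and the files were opened in text mode ("r"): the C runtime there treats the byte 0x1A (Ctrl-Z) as end-of-file in text mode, so reading stops early. Opening in binary mode ("rb") avoids that. Below is a minimal sketch of that idea; `files_equal`, `streams_equal`, and `CHUNK` are hypothetical names chosen for illustration, not part of the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { CHUNK = 65536 };  /* chunk size in bytes (assumed, matches the question) */

/* Returns 1 if both streams hold identical bytes, 0 otherwise.
   Uses static buffers, so this sketch is not thread-safe. */
static int streams_equal(FILE *a, FILE *b)
{
    static unsigned char buf_a[CHUNK], buf_b[CHUNK];
    size_t na, nb;

    assert(a != NULL && b != NULL);
    do {
        na = fread(buf_a, 1, CHUNK, a);
        nb = fread(buf_b, 1, CHUNK, b);
        /* Compare only the bytes actually read in this round. */
        if (na != nb || memcmp(buf_a, buf_b, na) != 0)
            return 0;
    } while (na == CHUNK);  /* a short read means EOF or an error */

    /* Treat a read error on either stream as "not equal". */
    return !ferror(a) && !ferror(b);
}

/* Hypothetical helper: opens both paths in binary mode and compares them.
   Returns 1 if identical, 0 if different, -1 if a file could not be opened. */
int files_equal(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");  /* "rb": no text-mode translation */
    FILE *b = fopen(path_b, "rb");
    int result = -1;

    if (a != NULL && b != NULL)
        result = streams_equal(a, b);
    if (a != NULL) fclose(a);
    if (b != NULL) fclose(b);
    return result;
}
```

Calling `files_equal(argv[1], argv[2])` from `main` would be one way to wire this to the command line. On POSIX systems "r" and "rb" behave the same, so the extra "b" costs nothing there.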
EDIT2: Task solved (working code attached). Thanks to everyone for the help!
First, since you've allocated your arrays on the stack, they are filled with random values … they aren't zeroed out.
Secondly, strcmp will only compare up to the first NUL byte, which, in a binary file, won't necessarily be at the end of the file. Therefore you should really be using memcmp on your buffers. But again, this will give unpredictable results because your buffers were allocated on the stack: even if you compare two files that are the same, the ends of the buffers past the EOF may not be the same, so memcmp will still report false results (i.e., it will most likely report that the files are not the same when they are, because of the random values at the end of the buffers past each respective file's EOF).
To get around this issue, you should first measure the length of each file in bytes, then use malloc or calloc to allocate the buffers you're going to compare, and fill those buffers with the actual file contents. Then you should be able to make a valid comparison of the binary contents of each file. You'll also be able to work with files larger than 64K at that point, since you're dynamically allocating the buffers at run-time.
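The approach described above could be sketched roughly as follows. This is only an illustration under some assumptions: both streams are seekable and opened in binary mode, `fseek` to `SEEK_END` works on the platform (it does on common ones, though the C standard does not guarantee it for binary streams), and the file size fits in a `long`. The names `stream_size` and `compare_whole` are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns the size of a seekable stream in bytes, or -1 on failure.
   Leaves the stream positioned at the beginning. */
static long stream_size(FILE *f)
{
    long size;

    if (fseek(f, 0L, SEEK_END) != 0) return -1;
    size = ftell(f);
    if (fseek(f, 0L, SEEK_SET) != 0) return -1;
    return size;
}

/* Returns 1 if the streams are identical, 0 if they differ,
   -1 on a seek, allocation, or read error. */
int compare_whole(FILE *a, FILE *b)
{
    long size_a, size_b;
    unsigned char *buf_a, *buf_b;
    int result = -1;

    assert(a != NULL && b != NULL);
    size_a = stream_size(a);
    size_b = stream_size(b);
    if (size_a < 0 || size_b < 0) return -1;
    if (size_a != size_b) return 0;   /* different lengths: no need to read */
    if (size_a == 0) return 1;        /* two empty files are equal */

    /* Buffers are sized to the files, so no uninitialized tail is compared. */
    buf_a = (unsigned char *) malloc((size_t) size_a);
    buf_b = (unsigned char *) malloc((size_t) size_b);
    if (buf_a != NULL && buf_b != NULL
        && fread(buf_a, 1, (size_t) size_a, a) == (size_t) size_a
        && fread(buf_b, 1, (size_t) size_b, b) == (size_t) size_b)
        result = (memcmp(buf_a, buf_b, (size_t) size_a) == 0);

    free(buf_a);
    free(buf_b);
    return result;
}
```

Checking the sizes first lets the function reject most differing files without reading them. For very large files (such as the 700 MB case), holding both files in memory may be impractical, and a fixed-size chunked loop that compares only the bytes actually returned by fread is probably the better fit.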