I’m trying to create a file checker with an auto updater for my program.
The idea is that the user downloads only the launcher for my program, and the launcher then downloads all the required files based on a few settings specified by the local user. It should also check whether each file is:
1) Up-To-Date,
2) Corrupt,
3) Not found,
4) Requires Update.
Cases 2, 3 and 4 would cause the file checker to add the file to the To_Download list, while in case 1 the file checker marks the file as valid and moves on.
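A minimal sketch of that decision logic, under some assumptions: the names `FileStatus`, `FileChecker`, `Classify` and `NeedsDownload` are hypothetical, and the launcher is assumed to know an expected hash plus a server version and an installed version for each file. A hash alone cannot tell "corrupt" apart from "requires update", so the version numbers are used for that here.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical names; the four states mirror the list above.
public enum FileStatus { UpToDate, Corrupt, NotFound, RequiresUpdate }

public static class FileChecker
{
    // Streams the file through SHA1 and returns an upper-case hex digest.
    public static string Sha1Hex(string path)
    {
        using (FileStream fs = File.OpenRead(path))
        using (SHA1 sha1 = SHA1.Create())
        {
            return BitConverter.ToString(sha1.ComputeHash(fs)).Replace("-", "");
        }
    }

    // Assumption: installedVersion comes from some local record kept by the launcher.
    public static FileStatus Classify(string path, string expectedSha1Hex,
                                      int serverVersion, int installedVersion)
    {
        if (!File.Exists(path))
        {
            return FileStatus.NotFound;
        }
        if (installedVersion < serverVersion)
        {
            return FileStatus.RequiresUpdate;
        }
        string actual = Sha1Hex(path);
        bool same = string.Equals(actual, expectedSha1Hex, StringComparison.OrdinalIgnoreCase);
        return same ? FileStatus.UpToDate : FileStatus.Corrupt;
    }

    // Everything except UpToDate goes on the To_Download list.
    public static bool NeedsDownload(FileStatus status)
    {
        return status != FileStatus.UpToDate;
    }
}
```

The launcher would call `Classify` once per file and add the file to To_Download whenever `NeedsDownload` returns true.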
To do this I thought of writing a checksum function that checks all files and compares their hashes against known healthy hashes (I’m using unmanaged SHA1). However, if I then download a fresh copy of a file from the server, the checksum ends up completely different, even though I know the files are identical save for a different modification/creation time.
I need a reliable file check that is quick but not easy to bypass, and that gives me confidence that the files on the user’s computer are the same as the ones on the server.
The reason I use SHA1 is that I read it has fewer ‘collisions’, and a collision is more ‘expensive’ to create than with the MD5 alternative.
I’m currently using:
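// Requires: using System.IO; using System.Security.Cryptography; using System.Text;
// Hash the stream directly, so the whole file is never read into memory at once
// (Convert.ToInt32(fs.Length) would also overflow for files larger than 2 GB).
using (FileStream fs = new FileStream(FilePath, FileMode.Open, FileAccess.Read))
using (SHA1CryptoServiceProvider unmanaged = new SHA1CryptoServiceProvider())
{
    byte[] retVal = unmanaged.ComputeHash(fs);
    StringBuilder stringBuilder = new StringBuilder(retVal.Length * 2);
    foreach (byte b in retVal)
    {
        // Look up the two-character hex representation of each byte.
        stringBuilder.Append(HexStringTable[b]);
    }
    // The hex digest that gets compared against the known healthy hash.
    string hash = stringBuilder.ToString();
}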
and the HexStringTable:
private static readonly string[] HexStringTable = new string[]
{
"00", "01", "02", "03", "04", "05", "06", "07", "08", "09", "0A", "0B", "0C", "0D", "0E", "0F",
"10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "1A", "1B", "1C", "1D", "1E", "1F",
"20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "2A", "2B", "2C", "2D", "2E", "2F",
"30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "3A", "3B", "3C", "3D", "3E", "3F",
"40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "4A", "4B", "4C", "4D", "4E", "4F",
"50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "5A", "5B", "5C", "5D", "5E", "5F",
"60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "6A", "6B", "6C", "6D", "6E", "6F",
"70", "71", "72", "73", "74", "75", "76", "77", "78", "79", "7A", "7B", "7C", "7D", "7E", "7F",
"80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "8A", "8B", "8C", "8D", "8E", "8F",
"90", "91", "92", "93", "94", "95", "96", "97", "98", "99", "9A", "9B", "9C", "9D", "9E", "9F",
"A0", "A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "AA", "AB", "AC", "AD", "AE", "AF",
"B0", "B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8", "B9", "BA", "BB", "BC", "BD", "BE", "BF",
"C0", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "CA", "CB", "CC", "CD", "CE", "CF",
"D0", "D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8", "D9", "DA", "DB", "DC", "DD", "DE", "DF",
"E0", "E1", "E2", "E3", "E4", "E5", "E6", "E7", "E8", "E9", "EA", "EB", "EC", "ED", "EE", "EF",
"F0", "F1", "F2", "F3", "F4", "F5", "F6", "F7", "F8", "F9", "FA", "FB", "FC", "FD", "FE", "FF"
};
Any ideas why the freshly downloaded file has a different hash than expected, even though it’s identical?
Edit:
I feel stupid for not comparing the two files in a hex editor. It seems the problem was one missing byte in those files, and I have fixed that now. It currently takes 60-70 seconds to check all 7000 files; is there any possible way to speed this up further?
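One possible direction, as a rough sketch rather than a measured solution: reject files whose length differs from the expected length before hashing at all (a missing byte would be caught here immediately), and hash the remaining files in parallel. The names `ExpectedFile` and `FastChecker` are hypothetical, and the expected length is assumed to be stored next to the expected hash. Whether the parallel part helps depends on the disk; on a spinning disk it may not.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;

// Hypothetical record of what the server says a file should look like.
public sealed class ExpectedFile
{
    public string FilePath;
    public long Length;
    public string Sha1Hex;
}

public static class FastChecker
{
    // Returns the sorted paths of all files that are missing or do not match.
    public static List<string> FindFilesToDownload(IEnumerable<ExpectedFile> expected)
    {
        ConcurrentBag<string> toDownload = new ConcurrentBag<string>();

        Parallel.ForEach(expected, e =>
        {
            FileInfo info = new FileInfo(e.FilePath);

            // Cheap check first: a missing file or a wrong size needs no hashing.
            if (!info.Exists || info.Length != e.Length)
            {
                toDownload.Add(e.FilePath);
                return;
            }

            // Same size, so confirm the content with SHA1 over a sequential read.
            using (FileStream fs = new FileStream(e.FilePath, FileMode.Open, FileAccess.Read,
                                                  FileShare.Read, 1 << 16, FileOptions.SequentialScan))
            using (SHA1 sha1 = SHA1.Create())
            {
                string actual = BitConverter.ToString(sha1.ComputeHash(fs)).Replace("-", "");
                if (!string.Equals(actual, e.Sha1Hex, StringComparison.OrdinalIgnoreCase))
                {
                    toDownload.Add(e.FilePath);
                }
            }
        });

        List<string> result = new List<string>(toDownload);
        result.Sort(StringComparer.Ordinal);
        return result;
    }
}
```

The length check is only a quick reject; a file with the right length still gets hashed, so the confidence from the hash comparison is kept.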
Did you try comparing the files to see what has changed? If the SHA1 is different, the files are different (the modification time has nothing to do with this). Try diffing them or comparing them in a hex editor to see what is different.
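If a hex editor is not at hand, the same comparison can be sketched in a few lines of code. This is only an illustration; `FileDiff` and `FirstDifference` are hypothetical names, and it reports just the first offset at which the two files disagree.

```csharp
using System;
using System.IO;

public static class FileDiff
{
    // Returns the offset of the first differing byte, or -1 if the files are identical.
    // If one file is a prefix of the other, the length of the shorter file is returned.
    public static long FirstDifference(string pathA, string pathB)
    {
        using (BufferedStream a = new BufferedStream(File.OpenRead(pathA)))
        using (BufferedStream b = new BufferedStream(File.OpenRead(pathB)))
        {
            long offset = 0;
            while (true)
            {
                int x = a.ReadByte();
                int y = b.ReadByte();
                if (x != y)
                {
                    return offset; // also covers one stream ending early (ReadByte gives -1)
                }
                if (x == -1)
                {
                    return -1; // both streams ended at the same point
                }
                offset++;
            }
        }
    }
}
```

Running this on the local copy and a fresh download shows where they diverge; a result equal to the length of the shorter file points to missing trailing bytes.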