The result I’m getting is that files of the same type return the same MD5 hash value. For example, two different JPGs give me the same result. However, a JPG and an APK give different results.
Here is my code…
    public static String checkHashURL(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            InputStream is = new URL(input).openStream();
            try {
                is = new DigestInputStream(is, md);
                int b;
                while ((b = is.read()) > 0) {
                    ;
                }
            } finally {
                is.close();
            }
            byte[] digest = md.digest();
            StringBuffer sb = new StringBuffer();
            for (int i = 0; i < digest.length; i++) {
                sb.append(
                        Integer.toString((digest[i] & 0xff) + 0x100, 16).substring(1));
            }
            return sb.toString();
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }
The loop condition is broken: while ((b = is.read()) > 0) stops at the first byte of the stream whose value is 0, not at the end of the stream. If two files have the same bytes before their first 0 byte, they will produce the same hash, which is likely why files of the same type (which tend to share a header) collide.
If you really want to call the byte-at-a-time version of read, loop until it returns -1 instead, that is, while (is.read() != -1) with an empty body. The parameterless InputStream.read() method returns -1 when it reaches the end of the stream. (There’s no need to assign a value to b, as you’re not using it.)
Better would be to read a buffer at a time: pass a byte array to read and loop while the returned count is greater than 0. This time the > 0 condition is valid, because InputStream.read(byte[]) would only ever return 0 if you pass in an empty buffer. Otherwise, it will try to read at least one byte, returning the length of data read or -1 if the end of the stream has been reached.
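As a rough sketch of the buffered approach, something like the following should work. The class name Md5Sketch, the helper md5Hex, the 8 KB buffer size, and the sample byte arrays are all assumptions made for illustration; none of them come from the question. It hashes an arbitrary InputStream so that it can be tried without a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Sketch {

    // Hypothetical helper: returns the MD5 of everything readable from the stream as lowercase hex.
    static String md5Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (DigestInputStream dis = new DigestInputStream(in, md)) {
            // The data is only read to feed the digest; 8 KB is an arbitrary buffer size.
            byte[] ignoredBuffer = new byte[8 * 1024];
            // read(byte[]) returns -1 at end of stream and never 0 for a non-empty buffer,
            // so "> 0" is a safe condition here even when the data contains 0 bytes.
            while (dis.read(ignoredBuffer) > 0) {
                // Nothing to do: DigestInputStream updates the digest as bytes pass through.
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            // Mask to 0..255 so each byte is rendered as exactly two hex digits.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Two made-up inputs that are identical up to and including their first 0 byte.
        byte[] first = {1, 2, 0, 3};
        byte[] second = {1, 2, 0, 4};
        System.out.println(md5Hex(new ByteArrayInputStream(first)));
        System.out.println(md5Hex(new ByteArrayInputStream(second)));
    }
}
```

With the original > 0 check on the single-byte read(), both sample arrays would be cut off at the 0 byte and hash identically; here the two printed values should differ. To hash a URL as in the question, pass new URL(input).openStream() to the helper.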