I’m looking for a way to create a unique hash for images in Python and PHP.
I thought about using MD5 sums of the original file because they can be generated quickly, but when I update the EXIF information (sometimes the timezone is off), the file contents change and so does the hash.
Are there any other ways I can create a hash for these files that will not change when the EXIF info is updated? Efficiency is a concern, as I will be creating hashes for ~500k 30MB images.
Maybe there’s a way to create an MD5 hash of the image, excluding the EXIF part (I believe it’s written at the beginning of the file?). Thanks in advance. Example code is appreciated.
In Python, you could use Image.tobytes() from PIL/Pillow (named Image.tostring() in older PIL releases) to compute the MD5 hash of the decoded image data only, without the metadata.
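A minimal sketch of that idea, assuming Pillow is installed. The function name `image_data_md5` is just an illustrative choice, not part of any library:

```python
import hashlib

from PIL import Image


def image_data_md5(path):
    """Return the MD5 hex digest of the decoded pixel data of the image at `path`.

    Only the pixel data is hashed, so metadata such as EXIF tags
    does not influence the result.
    """
    with Image.open(path) as img:
        # tobytes() decodes the image and returns the raw pixel buffer
        pixel_data = img.tobytes()
    return hashlib.md5(pixel_data).hexdigest()
```

Usage would look like `image_data_md5("photo.jpg")` (a hypothetical file name). Note that this has to decode every image, which is likely to be noticeably slower than hashing the raw file bytes, so it is worth timing on a sample before running it over ~500k files. For lossy formats such as JPEG, the decoded pixels may also depend on the decoder, so the hashes are best compared only when produced with the same library version.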