I’m working on an assignment for a self-study course in cryptography (I’m receiving no credit for this class). I need to compute hash values on a large file, where the hash is done block by block. The thing I’m stumped on at the moment is how to break the file up into these blocks. I’m using Python, which I’m very new to.
f = open('myfile', 'rb')
BLOCK_SIZE = 1024
m = Crypto.Hash.SHA256.new()
thisHash = ""
blocks = os.path.getsize('myfile') / BLOCK_SIZE #ignore partial last block for now
for i in Range(blocks):
    b = f.read(BLOCK_SIZE)
    thisHash = m.update(b.encode())
    f.seek(block_size, os.SEEK_CUR)
Am I approaching this correctly? The code seems to run up until the m.update(b.encode()) line executes. I don’t know if I am way off base or what to do to make this work. Any advice is appreciated. Thanks!
(note: as you might notice, this code doesn’t really produce anything at the moment – I’m just getting some of the scaffolding set up)
You’ll have to do a few things to make this example work correctly. Here are some points:
- Crypto.Hash.SHA256.SHA256Hash.update() (you invoke it as m.update()) has no return value. To pull a human-readable hash out of the object, .update() it a bunch of times and then call .hexdigest().
- You don’t need to .encode() the data before passing it to the .update() function. Just pass the string containing the data block.
- The file position is advanced by file.read(). You don’t need a separate .seek() operation.
- .read() will return an empty string if you’ve hit EOF already. This is totally fine. Feel free just to pull in that partial block.
- Variable names are case-sensitive: block_size is not the same variable as BLOCK_SIZE.

Making these few minor adjustments, and assuming you have all the right imports, you’ll be on the right track.
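
As a rough sketch of how those adjustments might fit together, here is one possible version of the loop. It rests on a few assumptions that are not in the question: it uses hashlib.sha256 from the standard library in place of Crypto.Hash.SHA256 (both objects offer update() and hexdigest()), the function name hash_file_in_blocks is made up for illustration, and it produces a single digest over the whole file, which may not be the exact block scheme your assignment asks for. It also reads until EOF rather than counting blocks up front, so the partial last block is included.

```python
import hashlib

BLOCK_SIZE = 1024  # bytes per read, as in the question


def hash_file_in_blocks(path, block_size=BLOCK_SIZE):
    """Return the SHA-256 hex digest of a file, feeding it in block by block.

    Hypothetical helper for illustration; hashlib stands in for
    Crypto.Hash.SHA256 here (an assumption, not the question's library).
    """
    m = hashlib.sha256()
    with open(path, 'rb') as f:
        while True:
            b = f.read(block_size)  # read() advances the file position itself
            if not b:               # an empty result means EOF was reached
                break
            m.update(b)             # update() returns None; pass the raw block
    return m.hexdigest()            # ask for the readable hash once, at the end
```

Calling hash_file_in_blocks('myfile') should then give the same hex digest as hashing the whole file contents in one go, since update() simply accumulates the data it is fed.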