The hashlib Python module provides the following hash algorithm constructors: md5(), sha1(), sha224(), sha256(), sha384(), and sha512().
Assuming I don’t want to use md5, is there a big difference in using, say, sha1 instead of sha512? I want to use something like hashlib.shaXXX(hashString).hexdigest(), but as it’s just for caching, I’m not sure I need the (eventual) extra overhead of 512…
Does this overhead exist, and if so, how big is it?
Why not just benchmark it?
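For instance, a quick (and admittedly unscientific) micro-benchmark with timeit — the input size and iteration count here are arbitrary illustrative choices:

```python
import timeit

# Time each digest over the same 1 KiB input.
setup = "import hashlib; data = b'x' * 1024"
for algo in ("md5", "sha1", "sha256", "sha512"):
    t = timeit.timeit(f"hashlib.{algo}(data).hexdigest()",
                      setup=setup, number=100_000)
    print(f"{algo}: {t:.3f} s")
```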
So on my machine, sha512 is about twice as slow as sha1. But as GregS said, why would you use a secure hash for caching? Try the built-in hash algorithms, which should be really fast and well tuned. Or better yet, use the built-in Python dictionaries. Maybe you can tell us more about what you plan on caching.
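As a sketch of that comparison (exact numbers vary by machine, but the built-in hash() should win by a wide margin, since it returns a plain integer rather than computing a cryptographic digest):

```python
import timeit

# Compare built-in hash() against a sha1 hexdigest on the same string.
setup = "import hashlib; s = 'x' * 1024"
print(timeit.timeit("hash(s)", setup=setup, number=100_000))
print(timeit.timeit("hashlib.sha1(s.encode()).hexdigest()",
                    setup=setup, number=100_000))
```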
EDIT:
I’m thinking that you are trying to achieve something like this:
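A minimal sketch of that pattern, assuming a hypothetical expensive_computation() you want to avoid repeating:

```python
import hashlib

def expensive_computation(s):
    # Hypothetical stand-in for whatever work you are caching.
    return s.upper()

cache = {}

def cached(s):
    # Key the cache on a hexdigest of the input.
    key = hashlib.sha1(s.encode()).hexdigest()
    if key not in cache:
        cache[key] = expensive_computation(s)
    return cache[key]

print(cached("hello"))  # HELLO
```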
What I was referring to by “use the builtin Python dictionaries” is that you can simplify the above:
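That is, since the input is already hashable, you can let the dict do the hashing itself (again with a hypothetical expensive_computation()):

```python
def expensive_computation(s):
    # Hypothetical stand-in for the work being cached.
    return s.upper()

cache = {}

def cached(s):
    # The string itself is a valid dict key; Python hashes it internally.
    if s not in cache:
        cache[s] = expensive_computation(s)
    return cache[s]

print(cached("hello"))  # HELLO
```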
In this way, Python takes care of the hashing so you don’t have to!
Regarding your particular problem, you could refer to Python hashable dicts in order to make a dictionary hashable. Then, all you’d need to do to cache the object is:
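One common recipe from that question, sketched here under the assumption that the dict is never mutated after creation:

```python
class HashableDict(dict):
    # Hash on the frozen set of items; this is only safe if the dict
    # is treated as immutable after construction.
    def __hash__(self):
        return hash(frozenset(self.items()))

cache = {}
params = HashableDict(x=1, y=2)
cache[params] = "expensive result"
# An equal dict hashes to the same value, so the lookup hits:
print(cache[HashableDict(x=1, y=2)])  # expensive result
```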
EDIT – Notes about Python 3
Python 3.3 introduced hash randomization, which means that computed hashes can differ across processes, so you should not rely on a computed hash unless you set the PYTHONHASHSEED environment variable to 0.

References:
– https://docs.python.org/3/reference/datamodel.html#object.__hash__
– https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED
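The effect is easy to demonstrate by running the same hash() call in fresh interpreter processes (a small sketch; the key name is arbitrary):

```python
import os, subprocess, sys

cmd = [sys.executable, "-c", 'print(hash("cache-key"))']

# Without a pinned seed, two fresh processes almost certainly disagree.
a = subprocess.run(cmd, capture_output=True, text=True).stdout

# With PYTHONHASHSEED pinned, the hash is reproducible across processes.
env = {**os.environ, "PYTHONHASHSEED": "0"}
b = subprocess.run(cmd, env=env, capture_output=True, text=True).stdout
c = subprocess.run(cmd, env=env, capture_output=True, text=True).stdout
print(b == c)  # True
```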