I previously started a topic about the built-in Python hash function: "Old python hashing done left to right – why is it bad?"
That topic was about why it is bad for encryption. We have an application called Gruyere, which is filled with security holes, and it uses hash() to encrypt cookies:
# global cookie_secret; only use positive hash values
h_data = str(hash(cookie_secret + c_data) & 0x7FFFFFF)
c_data is a username; cookie_secret is the salt (which is just '' by default, i.e. the empty string).
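To make the scheme concrete, here is a minimal sketch of the signing step as standalone functions. The names `sign_cookie`, `verify_cookie` and `hash_fn` are made up for illustration and are not taken from Gruyere; only the expression and the mask come from the snippet above. The verification step is an assumption about how such a cookie would be checked. Note that on Python 3 the built-in hash() of a string is randomized per process unless PYTHONHASHSEED is set, so the numbers will differ from those of the old Python 2 hash.

```python
def sign_cookie(c_data, cookie_secret="", hash_fn=hash):
    """Return the decimal string used as the cookie's hash field.

    Hypothetical helper mirroring the one-liner above. hash_fn is
    pluggable so a reimplementation of the old hash can be passed in.
    """
    # Same expression as in the snippet: keep only the low, positive bits.
    return str(hash_fn(cookie_secret + c_data) & 0x7FFFFFF)


def verify_cookie(c_data, h_data, cookie_secret="", hash_fn=hash):
    """Assumed check: recompute the hash and compare it to the presented one."""
    return sign_cookie(c_data, cookie_secret, hash_fn) == h_data
```

With the default empty secret, the result depends only on the username, so anyone who can run the same hash function can compute it.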
I have implemented a more secure method using MD5 hashing with a salt, but one exercise is to beat this old encryption, and I still cannot understand how 🙁 I’ve read the string_hash code in the Python source code, but it’s not documented and I can’t figure it out.
EDIT: The idea is to write a program which can create a valid cookie for any valid user, so I think I need to find out cookie_secret somehow.
Zack already described the answer in your last question: it’s easy to find a collision.
Let’s say you save hash("pwd") in the database (that you actually do something different doesn’t matter). Now, if you enter "pwd" on the site, you get in. But how is this checked? Again, the hash of "pwd" is taken and compared to the value in the database. But what if there is a second string, say "hello", and hash("hello") == hash("pwd")? Then you could also use "hello" as the password. So to beat the encryption, you don’t need to find "pwd"; you just need any string that has the same hash value. You can search for such a string by brute force (and I guess you can do some optimizations based on knowledge of the source of hash).
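The brute-force search described above could look roughly like this. It is only a sketch under stated assumptions: `old_string_hash` is a pure-Python approximation of the old Python 2 string hash on a 32-bit build (without hash randomization), and `find_collision`, its parameters and the candidate alphabet are hypothetical names chosen for illustration.

```python
import itertools
import string


def old_string_hash(s):
    """Approximate the old Python 2 str hash, assuming a 32-bit C long."""
    if not s:
        return 0
    data = s.encode("latin-1")
    x = (data[0] << 7) & 0xFFFFFFFF
    for byte in data:
        # Multiply-then-XOR, processing the string left to right.
        x = ((1000003 * x) ^ byte) & 0xFFFFFFFF
    x ^= len(data)
    if x >= 0x80000000:
        x -= 0x100000000  # reinterpret as a signed 32-bit value
    return -2 if x == -1 else x


def find_collision(target, mask=0x7FFFFFF,
                   alphabet=string.ascii_lowercase, max_len=6):
    """Return some string other than target with the same masked hash, or None."""
    wanted = old_string_hash(target) & mask
    for length in range(1, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            if candidate != target and (old_string_hash(candidate) & mask) == wanted:
                return candidate
    return None


# A small mask keeps the demo fast; the mask from the question keeps 27 bits.
other = find_collision("pwd", mask=0xFFF)
print(other, old_string_hash(other) & 0xFFF, old_string_hash("pwd") & 0xFFF)
```

With the 27-bit mask from the question there are only 2^27 (about 134 million) possible values, so a plain search like this should succeed eventually, although pure Python will be slow at that scale.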