I’m having a problem correctly hashing my objects. Consider the following code:
    class Foo:
        def __init__(self, bar):
            self.keys = list(bar.keys())
            self.values = list(bar.values())
        def __str__(self):
            return ', '.join('%s: %s' % z for z in zip(self.keys, self.values))
        def __hash__(self):
            return hash(str(self))
    if __name__ == '__main__':
        result = set()
        d = { 1: 2, 3: 4, 5: 6, 7: 8 }
        for i in range(10):
            result.add(Foo(d))
        for r in result:
            print r, hash(r)
I expect the result set to contain a single element, since all the added Foo objects have the same contents, and therefore the same hash.
However, this is the result:
misha@misha-K42Jr:~/Desktop/stackoverflow$ python hashproblem.py
1: 2, 3: 4, 5: 6, 7: 8 2131119371379196338
1: 2, 3: 4, 5: 6, 7: 8 2131119371379196338
1: 2, 3: 4, 5: 6, 7: 8 2131119371379196338
1: 2, 3: 4, 5: 6, 7: 8 2131119371379196338
1: 2, 3: 4, 5: 6, 7: 8 2131119371379196338
1: 2, 3: 4, 5: 6, 7: 8 2131119371379196338
1: 2, 3: 4, 5: 6, 7: 8 2131119371379196338
1: 2, 3: 4, 5: 6, 7: 8 2131119371379196338
1: 2, 3: 4, 5: 6, 7: 8 2131119371379196338
1: 2, 3: 4, 5: 6, 7: 8 2131119371379196338
What is the problem here? The hashes do look the same, so shouldn’t they be treated as duplicates by the built-in set object? Why does the set contain duplicates?
I’ve noticed that if I use str(Foo(d)) instead of Foo(d) when adding elements to the set, things work as expected. Why does it matter?
Python version is:
misha@misha-K42Jr:~/Desktop/stackoverflow$ python --version
Python 2.6.6
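Here is a minimal sketch of one way to make equal Foo objects collapse into a single set element, assuming two instances should count as equal when their keys and values match. The _as_tuple helper is a hypothetical name introduced for illustration and is not part of the original code; the sketch is written to run on both Python 2 and Python 3.

```python
class Foo(object):
    def __init__(self, bar):
        self.keys = list(bar.keys())
        self.values = list(bar.values())

    def _as_tuple(self):
        # Hypothetical helper: an immutable snapshot of the contents,
        # shared by __eq__ and __hash__ so the two always agree.
        return (tuple(self.keys), tuple(self.values))

    def __eq__(self, other):
        # Assumption: equality means "same keys and values, in the same order".
        if not isinstance(other, Foo):
            return NotImplemented
        return self._as_tuple() == other._as_tuple()

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so define it explicitly.
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    def __hash__(self):
        # Equal objects must produce equal hashes.
        return hash(self._as_tuple())


d = {1: 2, 3: 4, 5: 6, 7: 8}
result = set(Foo(d) for _ in range(10))
print(len(result))  # prints 1
```

Hashing a tuple of the contents instead of str(self) is a design choice here, not a requirement: it avoids building a string on every lookup, but it only works while every key and value is itself hashable.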
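Since the __hash__ method is only used to place the object in the internal hash table, you need to redefine __eq__ as well: once two hashes match, the set still compares the elements with ==, and the default comparison is by identity.
Overriding only __eq__ is not correct either. If two objects are equal, i.e. a.__eq__(b) == True, then hash(a) and hash(b) must be equal as well.
The default __hash__ method is: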