Suppose I write a class, but don’t define a __hash__ for it. Then __hash__(self) defaults to id(self) (self’s memory address), according to the documentation.
However, I don’t see in the documentation how this value is used.
So if my __hash__ was simply return 1, which would cause the hash of all instances of my class to be the same, they all get bucketed into the same underlying hash bucket (which I assume is implemented in C). However, this does not mean that the return value of __hash__ is being used as the key to bin elements in this underlying hash table.
So really, my question is: what happens to the value returned by __hash__? Is it used as the key directly, or is its hash (or the result of some other computation performed on it) used as the key to the hash table?
In case it matters, I’m on Python 2.7.
EDIT: To clarify, I’m not asking about how hash collisions are handled. In Python, this seems to be done with linear chaining. Instead, I’m asking how the return value of __hash__ translates into the memory address (?) of the corresponding bucket.
Since Python’s hash tables have a size that is a power-of-two, the lower bits of the hash value determine the location in the hash table (or at least the location of the initial probe).
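As a rough illustration of "the lower bits determine the location", the sketch below masks a hash value with the table size minus one. The helper name initial_slot is made up for this example and is not part of Python; small integers are used because they hash to themselves, which keeps the result predictable.

```python
def initial_slot(hashvalue, table_size):
    """Index of the first probe, assuming table_size is a power of two."""
    mask = table_size - 1      # for 8 slots the mask is 0b111
    return hashvalue & mask    # keep only the low bits of the hash


# 10 is 0b1010 and 18 is 0b10010: both end in 0b010, so in an
# 8-slot table they start their search at the same index.
print(initial_slot(hash(10), 8))
print(initial_slot(hash(18), 8))
```

Two keys whose hashes share the same low bits therefore start at the same slot, which is why a probe sequence is needed.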
The sequence of probes into a table of size n is given by:
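A minimal sketch of that probe sequence, written as a Python generator. It is modeled on the lookup loop in CPython 2.7's dictobject.c (the recurrence i = 5*i + perturb + 1 with PERTURB_SHIFT = 5); treating the hash as an unsigned 64-bit value is an assumption about the build, and the name gen_probes is only illustrative.

```python
from itertools import islice

PERTURB_SHIFT = 5              # constant used in CPython's dictobject.c
WORD_MASK = 0xFFFFFFFFFFFFFFFF  # assumption: 64-bit unsigned arithmetic


def gen_probes(hashvalue, n):
    """Yield the slot indices tried for hashvalue in a table of n slots.

    n is assumed to be a power of two.
    """
    mask = n - 1
    perturb = hashvalue & WORD_MASK   # view the hash as unsigned
    i = perturb & mask                # first probe: low bits of the hash
    yield i
    while True:
        # Mix in the higher bits of the hash, then shift them away.
        i = (5 * i + perturb + 1) & WORD_MASK
        yield i & mask
        perturb >>= PERTURB_SHIFT


# First few slots tried for a hash value of 1 in an 8-slot table.
print(list(islice(gen_probes(1, 8), 9)))
```

Once perturb has been shifted down to zero, the recurrence reduces to i = 5*i + 1 modulo the table size, which visits every slot, so a free slot is always found as long as the table is not full.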
For example, the dictionary:
is stored as an array of size 8 with each entry in the form (hash, key, value):

The C source code for key insertion in Python’s dictionaries can be found here: http://hg.python.org/cpython/file/cd87afe18ff8/Objects/dictobject.c#l550
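To make the (hash, key, value) layout concrete, here is a small simulation of inserting into an 8-slot table. It is a simplified sketch, not the real implementation: it ignores resizing and deletion, the keys are hypothetical small integers (chosen because they hash to themselves), and the helper names probes and build_table are invented for this example.

```python
def probes(hashvalue, n, shift=5):
    """Slot indices to try, following the perturb-based recurrence."""
    mask = n - 1
    perturb = hashvalue & 0xFFFFFFFFFFFFFFFF  # assumption: 64-bit build
    i = perturb & mask
    while True:
        yield i & mask
        i = (5 * i + perturb + 1) & 0xFFFFFFFFFFFFFFFF
        perturb >>= shift


def build_table(items, n=8):
    """Place (key, value) pairs into a list of n slots.

    Assumes fewer than n distinct keys, so a free slot always exists.
    """
    table = [None] * n
    for key, value in items:
        h = hash(key)
        for slot in probes(h, n):
            entry = table[slot]
            # Use the slot if it is empty or already holds this key.
            if entry is None or entry[1] == key:
                table[slot] = (h, key, value)
                break
    return table


# Keys 1 and 9 share the same low three bits, so 9 has to move on
# to the next slot in its probe sequence.
for index, entry in enumerate(build_table([(1, 'red'), (9, 'green'), (4, 'blue')])):
    print(index, entry)
```

The full hash is kept next to the key so that a lookup can compare hashes first and only fall back to the potentially slower key comparison when the hashes match.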