I don’t understand this explanation which says if n is the number of elements in the hash table and m is the total number of buckets then hashtables have constant access time in average only if n is proportional to theta(n). Why does it have to be proportional ?
Share
well actually m should be proportional to n. Otherwise you could, for example, have just 1 bucket and it would be just like an unsorted set.
To be more precise, if m is proportional to n, i.e. m = c * n, then the number of items in each bucket will be n/m = 1/c which is a constant. Going to any bucket is an O(1) operation (just compute the hash code) and then the search through the bucket is constant order (you could just do a linear search through the items in the bucket which would be a constant).
Thus the order of the algorithm is O(1), if m = c * n.
To take a converse example, suppose we had a fixed size table of size tableSize. Then the expected number of items in each bucket is n/tableSize which is a linear function of n. Any kind of search through the bucket is at best O(log(n)) for a tree (I’m assuming you don’t stick another hash table inside the bucket or we then have the same argument over that hash table), so it would not be O(1) in this case.