Why does List<T> increase its capacity by a factor of 2?
private void EnsureCapacity(int min)
{
if (this._items.Length < min)
{
int num = (this._items.Length == 0) ? 4 : (this._items.Length * 2);
if (num < min)
{
num = min;
}
this.Capacity = num;
}
}
Why does Dictionary<K,V> use prime numbers as capacity?
private void Resize()
{
int prime = HashHelpers.GetPrime(this.count * 2);
int[] numArray = new int[prime];
for (int i = 0; i < numArray.Length; i++)
{
numArray[i] = -1;
}
Entry<TKey, TValue>[] destinationArray = new Entry<TKey, TValue>[prime];
Array.Copy(this.entries, 0, destinationArray, 0, this.count);
for (int j = 0; j < this.count; j++)
{
int index = destinationArray[j].hashCode % prime;
destinationArray[j].next = numArray[index];
numArray[index] = j;
}
this.buckets = numArray;
this.entries = destinationArray;
}
Why doesn’t it also just multiply by 2? Both are dealing with finding continues memory location…correct?
It’s common to use prime numbers for hash table sizes because it reduces the probability of collisions.
Hash tables typically use the modulo operation to find the bucket where an entry belongs, as you can see in your code:
Suppose your hashCode function results in the following hashCodes among others {x , 2x, 3x, 4x, 5x, 6x…}, then all these are going to be clustered in just m number of buckets, where m = table_length/GreatestCommonFactor(table_length, x). (It is trivial to verify/derive this). Now you can do one of the following to avoid clustering:
Make sure that you don’t generate too many hashCodes that are multiples of another hashCode like in {x, 2x, 3x, 4x, 5x, 6x…}.But this may be kind of difficult if your hashTable is supposed to have millions of entries.
Or simply make m equal to the table_length by making GreatestCommonFactor(table_length, x) equal to 1, i.e by making table_length coprime with x. And if x can be just about any number then make sure that table_length is a prime number.
(from http://srinvis.blogspot.com/2006/07/hash-table-lengths-and-prime-numbers.html)
should return a prime number. Look at the definition of HashHelpers.GetPrime().