I’m trying to make a configurable bloom filter. In the constructor you set the predicted necessary capacity of the filter (n), the desired error rate (p), and a list of hash functions (of size k).
According to Wikipedia, the following relation holds (m being the number of bits):
p = (1 - k * n / m) ** k
Since I get p, n and k as parameters, I need to solve for m; I get the following:
m = k * n / (1 - p ** (1 / k))
However, there are a few things that make me think I did something wrong. For starters, p ** (1 / k) tends towards 1 for a large enough k, which means the whole fraction is ill-defined (because the denominator can conceivably reach 0).
Another thing you may notice is that as p (the allowed maximum error rate) grows, so does m, which is totally backwards.
Where did I go wrong?
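You did solve the equation correctly; however, note that Wikipedia states:
p = (1 - e ** (-k * n / m)) ** k
This is very different from what you’ve stated:
p = (1 - k * n / m) ** k
So what you really want to start with is
p = (1 - e ** (-k * n / m)) ** k
I worked this out to be
m = -k * n / ln(1 - p ** (1 / k))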
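To sanity-check the closed form m = -k * n / ln(1 - p ** (1 / k)), here is a minimal Python sketch. The function names bloom_bits and false_positive_rate are made up for illustration, and the code assumes the usual approximation p = (1 - e ** (-k * n / m)) ** k rather than the exact false-positive probability.

```python
import math


def bloom_bits(n, p, k):
    """Number of bits m for n items, target error rate p and k hash functions.

    Inverts the approximation p = (1 - exp(-k * n / m)) ** k for m.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if n <= 0 or k <= 0:
        raise ValueError("n and k must be positive")
    # log1p(-x) evaluates ln(1 - x) accurately, even when x is very small.
    m = -k * n / math.log1p(-(p ** (1.0 / k)))
    # Round up so the achieved error rate does not exceed the target.
    return math.ceil(m)


def false_positive_rate(n, m, k):
    """Approximate error rate for n items stored in m bits with k hashes."""
    return (1 - math.exp(-k * n / m)) ** k


if __name__ == "__main__":
    n, k = 1000, 7
    for p in (0.1, 0.01, 0.001):
        m = bloom_bits(n, p, k)
        print(p, m, false_positive_rate(n, m, k))
```

With this form, loosening p (making it larger) lowers m, which is the direction you expected, and the logarithm's argument stays strictly between 0 and 1 for any valid p, so the denominator is never 0.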