The Archive Base

Editorial Team
Asked: May 27, 2026

I’m currently experimenting with implementing interesting data structures in Ruby and have run into a problem testing functions whose output is not predictable. I’m working on a Bloom filter, whose implementation I’ve included below for completeness:

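require "zlib"

class BloomFilter
  def initialize(size = 100, hash_count = 3)
    raise(ArgumentError, "negative or zero buffer size") if size <= 0
    raise(ArgumentError, "negative or zero hash count") if hash_count <= 0

    @size = size
    @hash_count = hash_count
    @buffer = Array.new(size, false)
  end

  # Set every buffer position that the element hashes to.
  def insert(element)
    hash_indexes(element).each { |i| @buffer[i] = true }
  end

  # True if every position for the element is set. The result may be a
  # false positive, but it is never a false negative.
  def maybe_include?(element)
    hash_indexes(element).all? { |i| @buffer[i] }
  end

  private

  # Deliberately not named `hash`, which would override Object#hash.
  # Each seed from 1 to @hash_count yields one CRC32 value, reduced modulo
  # the buffer size to give a buffer position.
  def hash_indexes(element)
    (1..@hash_count).map { |seed| Zlib.crc32(element, seed) % @size }
  end
end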

One of the problems with a Bloom filter is that it can return false positives: it may report true for elements that have never been inserted into the filter.

Sometimes the filter behaves in a way that is easily testable:

b = BloomFilter.new(50, 5)

b.insert("hello")
puts b.maybe_include?("hello") # => true
puts b.maybe_include?("goodbye") # => false

However, it sometimes bucks the trend and behaves unpredictably. (I’ve reduced the size of the buffer here to find a collision quickly.)

b = BloomFilter.new(5, 4)

b.insert("testing")
puts b.maybe_include?("testing") # => true
puts b.maybe_include?("not present") # => false
puts b.maybe_include?("false positive") # => true (oops)
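
Going by the output above, the last lookup returns true because every buffer position that "false positive" hashes to was already set when "testing" was inserted. The following is a minimal, self-contained sketch for inspecting those positions. It assumes the same seeded CRC32 scheme as the filter; the helper name `positions_for` is hypothetical and is not part of the original class.

```ruby
require "zlib"
require "set"

# Hypothetical helper: the buffer positions a string maps to, using the same
# seeded CRC32 scheme as the filter above (seeds 1..hash_count).
def positions_for(element, size, hash_count)
  (1..hash_count).map { |seed| Zlib.crc32(element, seed) % size }
end

size = 5
hash_count = 4

# Positions that are true after inserting only "testing".
set_positions = Set.new(positions_for("testing", size, hash_count))

["testing", "not present", "false positive"].each do |word|
  positions = positions_for(word, size, hash_count)
  # The filter answers true exactly when every position is already set.
  reported = positions.all? { |i| set_positions.include?(i) }
  puts "#{word.inspect}: positions=#{positions.inspect} reported=#{reported}"
end
```

With only 5 buffer positions and 4 hashes per element, a single insert can set most of the buffer, so collisions like this are to be expected.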

So all of a sudden the string “false positive” gives us a… false positive. My question is: how can we test this?

  • If we choose values that just happen to work with our tests then I
    feel like the tests become far too fragile. For example, if we change
    the hashing function then we may still have a perfectly correct Bloom
    Filter that starts to fail some tests because of the values we chose
    to test the original implementation.

  • My second thought was to test that the filter behaves in an expected
    way by checking that we get roughly the expected number of false
    positives from it as we vary the number of hash functions and the
    size of the internal buffer. While this approach may test the overall
    rough correctness of the filter, I worry that it will not catch bugs
    that cause it to report incorrect values for individual cases (such
    as false negatives).
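
For reference, the "expected number of false positives" can be estimated with the standard approximation (1 - (1 - 1/m)^(k*n))^k, where m is the buffer size, k the number of hash functions, and n the number of inserted elements. The sketch below assumes independent, uniformly distributed hash functions, which seeded CRC32 values only approximate; the method name `expected_false_positive_rate` is hypothetical.

```ruby
# Approximate probability that a lookup for an absent element returns true,
# assuming independent, uniformly distributed hash functions (an assumption
# that the seeded CRC32 hashes in the question only roughly satisfy).
def expected_false_positive_rate(size, hash_count, inserted)
  # Probability that one given buffer position has been set by any insert.
  probability_bit_set = 1.0 - (1.0 - 1.0 / size)**(hash_count * inserted)
  # A false positive needs all hash_count positions to be set.
  probability_bit_set**hash_count
end

puts expected_false_positive_rate(50, 5, 1) # large buffer: very small
puts expected_false_positive_rate(5, 4, 1)  # tiny buffer: roughly 0.12
```

Under these assumptions the tiny 5-slot filter is expected to give a false positive for roughly one absent string in eight, while the 50-slot filter almost never does after a single insert.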

Am I being too pessimistic about the effectiveness of the two testing methods above, or am I missing a way to test classes such as the Bloom filter whose output is unpredictable?

1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 3:21 pm

    You’re right that choosing values that just happen to work is a bad idea. However, your second idea is not so bad.

    You should always be able to test that the values that should be in the Bloom filter are there, since a correct filter never produces false negatives. For false positives, you could randomly generate a number of strings that were never inserted and check that no more than a threshold fraction of them are reported as present. This way, if you change the hash function, your unit tests will still work and will still tell you whether the filter has an acceptable false positive ratio.
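
The two checks described above could look something like the following. This is a sketch under stated assumptions, not a definitive test suite: it embeds a compact copy of the filter so that it runs on its own, and the helper names (`random_words`, `false_negatives`, `false_positive_rate`), the fixed seed, the filter dimensions, and the 5% threshold are all illustrative choices rather than anything from the question.

```ruby
require "zlib"

# Compact copy of the filter from the question so the sketch runs on its own.
class BloomFilter
  def initialize(size = 100, hash_count = 3)
    @size = size
    @hash_count = hash_count
    @buffer = Array.new(size, false)
  end

  def insert(element)
    positions(element).each { |i| @buffer[i] = true }
  end

  def maybe_include?(element)
    positions(element).all? { |i| @buffer[i] }
  end

  private

  def positions(element)
    (1..@hash_count).map { |seed| Zlib.crc32(element, seed) % @size }
  end
end

# Hypothetical helper: reproducible random lowercase strings.
def random_words(count, rng, length: 12)
  Array.new(count) { Array.new(length) { rng.rand(97..122).chr }.join }
end

# Inserted words that the filter fails to report (should always be empty).
def false_negatives(filter, inserted)
  inserted.reject { |word| filter.maybe_include?(word) }
end

# Fraction of never-inserted words that the filter reports as present.
def false_positive_rate(filter, absent)
  absent.count { |word| filter.maybe_include?(word) }.fdiv(absent.size)
end

MAX_FALSE_POSITIVE_RATE = 0.05 # assumed, deliberately generous threshold

rng = Random.new(1234) # fixed seed keeps the run repeatable
inserted = random_words(100, rng)
absent = random_words(1_000, rng) - inserted

filter = BloomFilter.new(2_000, 5)
inserted.each { |word| filter.insert(word) }

puts "false negatives: #{false_negatives(filter, inserted).size}"
rate = false_positive_rate(filter, absent)
puts "false positive rate: #{rate} (limit #{MAX_FALSE_POSITIVE_RATE})"
```

The first check is exact and must hold for any hash function. The second is statistical, so the threshold should sit well above the rate you expect for the chosen size and hash count; otherwise a correct filter could fail the test by chance.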
