The Archive Base

Editorial Team
Asked: May 27, 2026

Working on wrapping my head around the CountSketch data structure and its associated algorithms.

I'm working on wrapping my head around the CountSketch data structure and its associated algorithms. It seems to be a great tool for finding frequent elements in streaming data, and its additive nature makes for some fun properties when looking for large changes in frequency, perhaps similar to what Twitter uses for trending topics.

The paper is a little difficult to follow for someone who has been away from more academic approaches for a while. A previous post here did help some, but for me at least it still left quite a few questions.

As I understand it, the CountSketch structure is similar to a Bloom filter. However, the selection of hash functions has me confused. The structure is an N by M table with N hash functions, one per row, each with M possible values that determine the “bucket” to alter in that row, plus another hash function s for each of the N rows that is “pairwise independent”.

Are the hashes to be selected from a universal hashing family, say something of the form h(x) = ((ax+b) % some_prime) % M?

And if so, where are the s hashes that return either +1 or -1 chosen from? And what is the reason for ever subtracting from one of the buckets?

1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 11:01 pm

    They subtract from the buckets so that the average effect of the additions and subtractions caused by other items is 0. If half the time I add the count of ‘foo’ and half the time I subtract it, then in expectation the count of ‘foo’ does not influence the estimate of the count for ‘bar’.
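
A minimal sketch of that idea, assuming the N by M table from the question and integer keys. The class name `CountSketch`, the prime `P`, and drawing the sign hash from the same ((ax+b) % prime) family and taking its parity are illustrative assumptions, not something the paper or this answer prescribes.

```python
import random
import statistics

P = (1 << 61) - 1  # a large prime; keys are assumed to be integers below it


class CountSketch:
    def __init__(self, rows, cols, seed=0):
        rng = random.Random(seed)
        self.rows, self.cols = rows, cols
        self.table = [[0] * cols for _ in range(rows)]
        # One random (a, b) pair per row for the bucket hash,
        # and a separate pair per row for the sign hash.
        self.bucket_ab = [(rng.randrange(1, P), rng.randrange(P)) for _ in range(rows)]
        self.sign_ab = [(rng.randrange(1, P), rng.randrange(P)) for _ in range(rows)]

    def _bucket(self, row, key):
        a, b = self.bucket_ab[row]
        return ((a * key + b) % P) % self.cols

    def _sign(self, row, key):
        # Parity of a second hash picks +1 or -1 (illustrative choice).
        a, b = self.sign_ab[row]
        return 1 if ((a * key + b) % P) % 2 == 0 else -1

    def add(self, key, count=1):
        # Each row adds or subtracts the count, depending on the key's sign.
        for r in range(self.rows):
            self.table[r][self._bucket(r, key)] += self._sign(r, key) * count

    def estimate(self, key):
        # Multiplying by the sign again undoes it for this key; colliding keys
        # keep their own random signs, so they tend to cancel out.
        return statistics.median(
            self._sign(r, key) * self.table[r][self._bucket(r, key)]
            for r in range(self.rows)
        )
```

With a single key in the sketch the estimate is exact, because that key's sign is applied once on update and once on lookup. With many keys, each row's estimate is off by a sum of other keys' counts carrying random signs, and taking the median across rows damps the rows where a collision happened to be large.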

    Picking a universal hash function like you describe will indeed work, but that matters mostly for the theory rather than the practice. Salting your favorite reasonable hash function will work too; you just can’t meaningfully write proofs about expected values when using a few fixed hash functions.
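
One way the salting approach might look, assuming string keys and using the row index as the salt. The helper name `salted_hashes` and the choice of BLAKE2b are hypothetical; any reasonable hash that accepts a salt or prefix would serve the same purpose.

```python
import hashlib


def salted_hashes(key: str, row: int, cols: int) -> tuple[int, int]:
    """Return (bucket, sign) for `key` in the given row of the table."""
    salt = row.to_bytes(8, "little")  # the row index acts as the salt
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8, salt=salt).digest()
    value = int.from_bytes(digest, "little")
    sign = 1 if value & 1 else -1   # lowest bit picks +1 or -1
    bucket = (value >> 1) % cols    # remaining bits pick the bucket
    return bucket, sign
```

Each row sees what behaves like an unrelated hash function, and a single digest supplies both the bucket and the sign. As noted above, this gives up the pairwise-independence guarantees the proofs rely on in exchange for convenience.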
