The Archive Base

Editorial Team
Asked: May 31, 2026


I have a custom closed-hashset/open-addressing (i.e. no linked lists) class. It’s very specific to my needs: it’s not generic (only for positive long numbers), needs the number of records to be inserted to be predefined, and doesn’t support remove. It is, however, meant to consume as little space as possible.

Since it has so little functionality, it’s a really small and simple class. However, for some reason, when I insert many entries, the number of collisions becomes much too high much too fast.

Some code (Java):

public class MyHashSet
{
    private long[] _entries;

    public MyHashSet(int numOfEntries)
    {
        int neededSize = (int)(numOfEntries / 0.65D);
        _entries = new long[neededSize];
    }

    public void add(long num)
    {
        int cell = ((Long) (num % _entries.length)).intValue();

        while (_entries[cell] != 0)
        {
            if (++cell >= _entries.length)  
                cell = 0;                   
        }

        _entries[cell] = num;
    }
...

I have a main which instantiates a MyHashSet object with 10 million as a parameter, then calls add() 10 million times, each time with a different randomly generated (yet positive) Long number. While this insertion takes about a second in total with the standard Java HashSet, it takes about 13 seconds with MyHashSet.
I added a counter for collisions and indeed, the number of collisions is 3-6 billion, way more than expected (I’d guess that about 30-40 million is to be expected).

Am I doing something wrong? Is there something wrong with the hashing itself? Why would there be so many collisions, and what can I do about it?

Thank you!

P.S.: The number 0.65 in the code means that the table will only get 65% filled, which I know is supposed to work well in closed hash sets. For that matter, even if I set it to 20%, the insertion still takes more than 10 seconds.

— EDIT —

This is quite embarrassing to admit, but my test code recreated the Random object (with System.currentTimeMillis() as a seed) in each iteration of the loop, rather than using the same one for the entire run.

After fixing it, the insertion takes about 2-3 seconds. This still seems too much in comparison: why would the default Java HashSet take only a second to insert into, when it is more ‘complex’ than MyHashSet? I now get only around 9 million collisions. I also tried removing the logging code to see if it helps, but it doesn’t make up the difference. I’d appreciate any ideas, and sorry again for the confusion before.
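To illustrate the seeding mistake described in the edit, here is a minimal sketch. It is not the asker's actual test code, which is not shown; the class and method names are made up for illustration, and the `seed` parameter stands in for `System.currentTimeMillis()`, which returns the same value for every iteration that runs within the same millisecond. Because `add()` as posted never checks whether a value is already present, each repeated value would have to probe past all earlier copies of itself, which is one plausible reason the collision count was so large.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical demo class; names are assumptions, not from the original post.
final class RandomSeeding {
    // Clears the sign bit so the result is never negative.
    // (0 is possible in principle, but vanishingly unlikely.)
    static long nextNonNegative(Random random) {
        return random.nextLong() & Long.MAX_VALUE;
    }

    // Pitfall: a fresh Random per iteration. With an identical seed,
    // every generator produces the same first value.
    static long[] reseededEveryIteration(int count, long seed) {
        long[] values = new long[count];
        for (int i = 0; i < count; i++) {
            values[i] = nextNonNegative(new Random(seed));
        }
        return values;
    }

    // Fix: one Random instance shared by the whole run.
    static long[] sharedGenerator(int count, long seed) {
        Random random = new Random(seed);
        long[] values = new long[count];
        for (int i = 0; i < count; i++) {
            values[i] = nextNonNegative(random);
        }
        return values;
    }

    // Number of distinct values in the array.
    static long distinctCount(long[] values) {
        return Arrays.stream(values).distinct().count();
    }

    public static void main(String[] args) {
        long seed = System.currentTimeMillis();
        System.out.println("reseeded: "
                + distinctCount(reseededEveryIteration(1_000, seed)) + " distinct values");
        System.out.println("shared:   "
                + distinctCount(sharedGenerator(1_000, seed)) + " distinct values");
    }
}
```

With a fixed seed the reseeded variant yields a single distinct value, while the shared generator yields a stream of different values.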


1 Answer

    Editorial Team
    Added an answer on May 31, 2026 at 8:29 pm

    The first thing I notice is gratuitous boxing on the line

    int cell = ((Long) (num % _entries.length)).intValue();
    

    which is much slower than

    int cell = (int) (num % _entries.length);
    

    (Note that num % _entries.length will always fit in an int, since _entries.length is itself an int.)

    Admittedly, Java’s HashSet would suffer from similar overhead anyway, but that’s at least one obvious thing to fix.
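As a small sketch of the two forms side by side (the class and method names here are made up for illustration), both compute the same cell index; the second simply skips the intermediate `Long` object:

```java
// Hypothetical helper class; names are assumptions, not from the original post.
final class CellIndex {
    // Form from the question: boxes the long remainder into a Long,
    // then unboxes it as an int.
    static int boxed(long num, int tableLength) {
        return ((Long) (num % tableLength)).intValue();
    }

    // Suggested form: a plain primitive narrowing cast, no boxing.
    static int primitive(long num, int tableLength) {
        return (int) (num % tableLength);
    }

    public static void main(String[] args) {
        int tableLength = (int) (10_000_000 / 0.65D); // sizing used in the question
        long[] samples = {1L, 42L, 9_876_543_210L, Long.MAX_VALUE};
        for (long num : samples) {
            System.out.println(num + " -> " + primitive(num, tableLength)
                    + " (boxed form: " + boxed(num, tableLength) + ")");
        }
    }
}
```

For the positive inputs the question describes, the remainder is always in the range 0 to tableLength - 1, so the narrowing cast cannot lose information.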

    Also, it’s probably to your advantage to make sure that the table size is a prime number. The simplest way to do this is BigInteger.valueOf((int)(numOfEntries / 0.65)).nextProbablePrime().intValue(), and since it’s a one-time cost it shouldn’t affect overall performance too badly.
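A minimal sketch of that sizing step, assuming a helper named `primeSizeFor` (a made-up name). Note that `nextProbablePrime()` returns the first probable prime strictly greater than its argument, and that the result is only guaranteed to fit in an `int` while the requested size stays well below `Integer.MAX_VALUE`.

```java
import java.math.BigInteger;

// Hypothetical helper class; names are assumptions, not from the original post.
final class PrimeTableSize {
    // Returns the first probable prime strictly greater than
    // numOfEntries / loadFactor. Assumes the result fits in an int.
    static int primeSizeFor(int numOfEntries, double loadFactor) {
        int minimum = (int) (numOfEntries / loadFactor);
        return BigInteger.valueOf(minimum).nextProbablePrime().intValue();
    }

    public static void main(String[] args) {
        // The question's parameters: 10 million entries at 65% fill.
        System.out.println(primeSizeFor(10_000_000, 0.65D));
    }
}
```

In the constructor from the question, this would replace the direct computation of `neededSize`.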

    Alternatively, Java’s HashSet uses power-of-2 hash table sizes, so it can compute the index with a mask (value & (_entries.length - 1), basically) rather than with the % operator, which is frequently more expensive.
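A rough sketch of the power-of-two variant, under a few assumptions: the class and method names are made up, the requested minimum is between 1 and 2^30, and the bit-mixing step is borrowed from the kind of spreading `java.util.HashMap` applies to hash codes (`h ^ (h >>> 16)`). Some mixing is advisable here because a bare mask looks only at the low bits of the value.

```java
// Hypothetical helper class; names are assumptions, not from the original post.
final class PowerOfTwoTable {
    // Smallest power of two >= minimum (valid for 1 <= minimum <= 2^30).
    static int tableSizeFor(int minimum) {
        int highest = Integer.highestOneBit(minimum);
        return (highest == minimum) ? highest : highest << 1;
    }

    // Folds the high bits of the value into the low bits, then masks.
    // tableLength must be a power of two for the mask to be valid.
    static int indexFor(long num, int tableLength) {
        int h = Long.hashCode(num); // (int) (num ^ (num >>> 32))
        h ^= (h >>> 16);            // spread high bits downward
        return h & (tableLength - 1);
    }

    public static void main(String[] args) {
        int size = tableSizeFor((int) (10_000_000 / 0.65D));
        System.out.println("table size: " + size);
        System.out.println("index of 9876543210: " + indexFor(9_876_543_210L, size));
    }
}
```

The trade-off is memory: rounding up to a power of two can allocate noticeably more cells than the prime-sized table, which matters for a class meant to be as compact as possible.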

© 2021 The Archive Base. All Rights Reserved