The Archive Base

The Archive Base Latest Questions

Editorial Team
Asked: May 23, 2026

I’m creating a hash table in Perl, of an unknown size.

The hash table maps a string to a reference to an array.

The main loop of my application adds 5-10 elements to the hash table in each iteration. As the hash table fills up, things slow down drastically. From observation, once there are ~50k keys in the hash table, adding keys is roughly 20 times slower.

I postulate that the hash table has become full, and collisions are occurring. I would like to ‘reserve’ the size of the hash table, but I’m unsure how.


The hash in question is %hNgramsToWord.

For each word, every n-gram of that word (for n from 1 up to the word’s length) is added as a key, mapped to a reference to an array of the words that contain that n-gram.

For example:

AddToNgramHash("hello");

[h, e, l, l, o, he, el, ll, lo, hel, ell, llo, hell, ello, hello] are all added as keys, each mapping to "hello".

sub AddToNgramHash($) {
    my $word = shift;
    my @aNgrams = MakeNgrams($word);
    foreach my $ngram (@aNgrams) {
       my @aWords;
       if(defined($hNgramsToWord{$ngram})) {
          @aWords = @{$hNgramsToWord{$ngram}};
       }
       push (@aWords, $word);
       $hNgramsToWord{$ngram} = \@aWords;
    }
    return scalar keys %hNgramsToWord;
}

sub MakeNgrams($) {
    my $word = shift;
    my $len = length($word);
    my @aNgrams;
    for(1..$len) {
       my $ngs = $_;
       for(0..$len-$ngs) {
          my $ngram = substr($word, $_, $ngs);
          push (@aNgrams, $ngram);
       }
    }
    return @aNgrams;
}
1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 4:04 pm

    You can set the number of buckets for a hash like so:

    keys(%hash) = 128;
    

    The number will be rounded up to a power of two.
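To check what a preallocation request actually produced, something like the following may help. It is a minimal sketch that assumes Perl 5.26 or later, where `Hash::Util::num_buckets` is available; the exact bucket count you get depends on the Perl version, so the script only prints it.

```perl
use strict;
use warnings;
use Hash::Util ();   # num_buckets() is assumed to be available (Perl 5.26+)

my %hash;
keys(%hash) = 100;   # ask Perl to reserve room for at least 100 buckets

# Report how many buckets were actually allocated for the still-empty hash.
my $buckets = Hash::Util::num_buckets(%hash);
print "buckets allocated: $buckets\n";
```

The reported value should be a power of two that is at least as large as the number requested.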

    That said, it is very unlikely that the slowdown you see is due to excess hash collisions, since Perl will dynamically expand the number of buckets as needed. And since 5.8.2, it will even detect pathological data that results in a given bucket being overused and reconfigure the hashing function for that hash.

    Show your code, and we will likely be able to help find the real problem.

    A demonstration of a large number of hash keys (don’t let it continue till you are out of memory…):

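    use strict;
    use warnings;
    my $start = time();
    my %hash;
    # Once a second, report the insertion rate and the size of the hash.
    $SIG{ALRM} = sub {
        alarm 1;    # re-arm the timer
        printf(
            "%.0f keys/s; %d keys, %s buckets used\n",
            keys(%hash) / (time() - $start),
            scalar(keys(%hash)),
            # "used/total" buckets before Perl 5.26; just the key count since
            scalar(%hash)
        );
    };
    alarm 1;
    # Insert random keys forever; interrupt with Ctrl-C.
    $hash{rand()}++ while 1;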
    

    Once there are a LOT of keys, you will notice a perceptible slowdown when it needs to expand the number of buckets, but it still maintains a pretty even pace.

    Looking at your code, the more words are loaded, the more work it has to do for each word: for every n-gram, it copies the entire existing array of words just to append one more.
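A rough way to see how that cost grows is to count the elements copied by the copy-then-push pattern. This is only an illustrative sketch: the helper `count_copied_elements` and the single shared key `x` are made up for the measurement and are not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical measurement: count how many array elements the
# copy-then-push pattern copies when $n words share one n-gram.
sub count_copied_elements {
    my ($n) = @_;
    my %hash;
    my $copied = 0;
    for my $i (1 .. $n) {
        my @words;
        if (defined $hash{x}) {
            @words = @{ $hash{x} };   # copies every word stored so far
            $copied += @words;        # array in scalar context = element count
        }
        push @words, "word$i";
        $hash{x} = \@words;
    }
    return $copied;
}

printf "%d words -> %d elements copied\n", $_, count_copied_elements($_)
    for 10, 100, 1000;
```

For n words under one key this copies n(n-1)/2 elements in total, so the work per insertion grows with the number of words already stored.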

    You can fix it by changing this:

       my @aWords;
       if(defined($hNgramsToWord{$ngram})) {
          @aWords = @{$hNgramsToWord{$ngram}};
       }
       push (@aWords, $word);
       $hNgramsToWord{$ngram} = \@aWords;
    

    to this:

       push @{ $hNgramsToWord{$ngram} }, $word;
    

    No need to copy the array twice. No need to check if the ngram already has an entry – it will autovivify an array reference for you.
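Put together, the two subs from the question might look like the sketch below. It keeps the names `AddToNgramHash`, `MakeNgrams`, and `%hNgramsToWord` from the question; dropping the `($)` prototypes and declaring the hash at file scope are assumptions of this sketch, not something the original code requires.

```perl
use strict;
use warnings;

my %hNgramsToWord;   # n-gram => reference to an array of words containing it

# Return every substring of $word, shortest first.
sub MakeNgrams {
    my ($word) = @_;
    my $len = length $word;
    my @ngrams;
    for my $n (1 .. $len) {
        for my $start (0 .. $len - $n) {
            push @ngrams, substr($word, $start, $n);
        }
    }
    return @ngrams;
}

# Append $word to the list stored under each of its n-grams.
# The array reference is autovivified the first time a key is seen.
sub AddToNgramHash {
    my ($word) = @_;
    push @{ $hNgramsToWord{$_} }, $word for MakeNgrams($word);
    return scalar keys %hNgramsToWord;
}

print AddToNgramHash("hello"), "\n";   # 14
print AddToNgramHash("help"), "\n";    # 18
print "@{ $hNgramsToWord{hel} }\n";    # hello help
```

Each call now does work proportional to the number of n-grams in the new word, regardless of how many words are already stored.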
