Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7975203
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 4, 20262026-06-04T08:35:14+00:00 2026-06-04T08:35:14+00:00

I currently have an algorithm that hashes a key and checks for it’s uniqueness

  • 0

I currently have an algorithm that hashes a key and checks for it’s uniqueness using map::count. How could this be optimized? I also forgot to mention that this is threaded.

int coll = 0;
map<long, bool> mymap;
#pragma omp parallel for
for (int i = 0; i < 256; i++)
  for (int j = 0; j < 256; j++)
    for (int k = 0; k < 256; k++)
    {
      string temp;
      temp = i;
      temp += j;
      temp += k;
      temp += temp;
      long myhash = hash(temp.c_str());

     if (mymap.count(myhash))
      {
        #pragma omp atomic
        coll++;
        cout << "Collision at " << i << " " << j << " " << k << endl;
      }
      else
      {
        #pragma omp critical
        mymap[myhash] = true;
      }
    }

cout << "Number of collisions: " << coll << endl;
cout << "Map size: " << mymap.size() << endl;

After much trial and error, here is the best version I could produce, generating 4294967296 keys in 82.5 seconds using 1GB of RAM.

#include <iostream>
#include <string>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/time.h>
#include <iomanip>
#include <omp.h>
#include <vector>
#include <fstream>
#include <ios>
#include <unistd.h>
using namespace std;

class Timer 
{
private:
  timeval startTime;

public:

  void start()
  {
    gettimeofday(&startTime, NULL);
  }

  double stop()
  {
    timeval endTime;
    long seconds, useconds;
    double duration;

    gettimeofday(&endTime, NULL);

    seconds  = endTime.tv_sec  - startTime.tv_sec;
    useconds = endTime.tv_usec - startTime.tv_usec;

    duration = seconds + useconds/1000000.0;

    return duration;
  }

  static void printTime(double duration)
  {
    cout << setprecision(10) << fixed << duration << " seconds" << endl;
  }
};

static inline long hash(const char* str)
{
  return (*(long*)str)>> 0;
}  
int coll;
vector<bool> test;

void process_mem_usage(double& vm_usage, double& resident_set)
{
  using std::ios_base;
  using std::ifstream;
  using std::string;

  vm_usage     = 0.0;
  resident_set = 0.0;

  // 'file' stat seems to give the most reliable results
  //
  ifstream stat_stream("/proc/self/stat",ios_base::in);

  // dummy vars for leading entries in stat that we don't care about
  //
  string pid, comm, state, ppid, pgrp, session, tty_nr;
  string tpgid, flags, minflt, cminflt, majflt, cmajflt;
  string utime, stime, cutime, cstime, priority, nice;
  string O, itrealvalue, starttime;

  // the two fields we want
  //
  unsigned long vsize;
  long rss;

  stat_stream >> pid >> comm >> state >> ppid >> pgrp >> session >> tty_nr
          >> tpgid >> flags >> minflt >> cminflt >> majflt >> cmajflt
          >> utime >> stime >> cutime >> cstime >> priority >> nice
          >> O >> itrealvalue >> starttime >> vsize >> rss; // don't care about the rest

  stat_stream.close();

  long page_size_kb = sysconf(_SC_PAGE_SIZE) / 1024; // in case x86-64 is configured to use 2MB pages
  vm_usage     = vsize / 1024.0;
  resident_set = rss * page_size_kb;
}
Timer timer;
void signal_handlerkill(int sig)
{
  cout << "Number of collisions: " << coll << endl;
  //cout << test.size() << endl;
  double vm, rss;
  process_mem_usage(vm, rss);
  vm /= 1024.0;
  rss /= 1024.0;
  cout << "VM: " << vm << "MB" << endl;
  timer.printTime(timer.stop());
  exit(1);
}

int main()
{
  signal(SIGINT, signal_handlerkill);
  timer = Timer();
  timer.start();
  coll = 0;

  for (long i = 0; i < 4294967296+1; i++)
  {
    test.push_back(0); //Set up the vector
  }

  #pragma omp parallel for 
  for (int i = 0; i < 256; i++)
    for (int j = 0; j < 256; j++)
      for (int k = 0; k < 256; k++)
        for (int l = 0; l < 256; l++)
        {
          const char temp[4] = {i, j, k, l};
          long myhash = (*(long*)temp);

          if(test.at(myhash))
          {
            #pragma omp atomic
            coll++;
          }
          else
          {
            test[myhash].flip();
          }
        }

  cout << "Number of collisions: " << coll << endl;
  double vm, rss;
  process_mem_usage(vm, rss);
  vm /= 1024.0;
  rss /= 1024.0;
  cout << "VM: " << vm << "MB" << endl;
  timer.printTime(timer.stop());

  return 0;
}
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-04T08:35:16+00:00Added an answer on June 4, 2026 at 8:35 am

    In terms of space, you could use a set instead of a map, since the bool value is useless.

    Also, if you’re using C++11, an unordered_set will probably give better performance.

    Also,

    temp = i;
    temp += j;
    temp += k;
    temp += temp;
    

    probably has a bigger overhead than using a stringstream or even char arrays.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

We have our own data streaming algorithm that include some metadata+records+fields values. Currently we
I currently have a query that looks like this: SELECT NON EMPTY ([Measures].[TOTAL]) ON
I'm currently programming a vocabulary algorithm that checks if a user has typed in
I currently have an algorithm that operates on an adjacency matrix of size n
I currently have an algorithm that generates a point grid. It takes input from
I currently have links w/ class=ajax that I want to retrieve the element with
I am currently trying to write an algorithm that determines how many bits are
I currently have a simple application that needs to click on certain coordinates and
I have an algorithm which currently allocates a very large array of doubles, which
I have been trying to get a working algorithm that detects intersection between a

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.