The Archive Base


Editorial Team
Asked: May 20, 2026

Once again, I have a problem for which I would like to shave off

Once again, I have a problem for which I would like to shave off nanoseconds. I have a small, constant array and I would like to search it to see if a given number is a member*.

Input: A 64-bit number n.

Output: True if n is in the array, false if n is not.

What are good techniques for making binary searches fast, given the opportunity to optimize for the specific elements and their distribution?

Specifics

I have an array with around 136 members (though see below: there’s some flexibility) to search. The members are not equally distributed through the range: they cluster toward the beginning and end of the range. The input numbers can be assumed to be chosen with uniform probability. It’s probably worthwhile to take advantage of this irregularity.

Here’s a sample picture of the distribution for the 136-element array. Note that only 12 of the 136 elements are between 1% and 99% of the range; the balance are below 1% or over 99%.


[Figure: distribution of the 136 array elements across the range] (source: crg4.com)

I assume that branch misprediction will be the largest cost of any implementation. I’d be happy to be proved wrong.

Notes

* Actually, I have two arrays. Actually actually, I have a choice of what arrays to use: efficiency suggests that the first should have perhaps 10-40 members, while the second can have no more than (exactly) 136 members. My problem gives real flexibility in selecting sizes, along with limited freedom to decide precisely which members to use. If a method performs better with certain sizes or restrictions, please mention this because I may be able to use it. All things equal, I’d prefer to have the second array as large as possible. For reasons unrelated to the binary search I may need to reduce the size of the second array to <= 135 or <= 66 (this is related to the difficulty of determining the input number, which depends on the array selected).

Here’s one of the possible arrays, if it helps in testing ideas. (This pretty well reveals my purpose…!) Don’t jump to unwarranted conclusions on the basis of the first few members, though.

0, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309, 3524578, 5702887, 9227465, 14930352, 24157817, 39088169, 63245986, 102334155, 165580141, 267914296, 433494437, 701408733, 1134903170, 1836311903, 2971215073, 4807526976, 7778742049, 12586269025, 20365011074, 32951280099, 53316291173, 86267571272, 139583862445, 225851433717, 365435296162, 591286729879, 956722026041, 1548008755920, 2504730781961, 4052739537881, 6557470319842, 10610209857723, 17167680177565, 27777890035288, 44945570212853, 72723460248141, 117669030460994, 190392490709135, 308061521170129, 498454011879264, 806515533049393, 1304969544928657, 2111485077978050, 3416454622906707, 5527939700884757, 8944394323791464, 14472334024676221, 23416728348467685, 37889062373143906, 61305790721611591, 99194853094755497, 160500643816367088, 259695496911122585, 420196140727489673, 679891637638612258, 1100087778366101931, 1779979416004714189, 2880067194370816120, 4660046610375530309, 7540113804746346429, 9320093220751060618, 9999984858389672876, 10259680355300795461, 10358875208395550958, 10396764270768694864, 10411236604793371085, 10416764544494255842, 10418876029572233892, 10419682545105283285, 10419990606626453414, 10420108275656914408, 10420153221227127261, 10420170388907304826, 10420176946377624668, 10420179451108406629, 10420180407830432670, 10420180773265728832, 10420180912849591277, 10420180966165882450, 10420180986530893524, 10420180994309635573, 10420180997280850646, 10420180998415753816, 10420180998849248253, 10420180999014828394, 10420180999078074380, 10420180999102232197, 10420180999111459662, 10420180999114984240, 10420180999116330509, 10420180999116844738, 10420180999117041156, 10420180999117116181, 10420180999117144838, 10420180999117155784, 10420180999117159965, 10420180999117161562, 10420180999117162172, 10420180999117162405, 10420180999117162494, 10420180999117162528, 
10420180999117162541, 10420180999117162546, 10420180999117162548

I will initially run the program on a Phenom II x4, but optimizations for other architectures are welcome.

1 Answer

  1. Editorial Team
    Added an answer on May 20, 2026 at 2:01 pm

    As a very simple possible optimization, create a 256-entry lookup table for the most significant 8 bits of your 64-bit value. Each row of the table stores the indices, in the actual array, of the lower and upper bounds of the values that share those most significant 8 bits. You only need to search this region of the array.

    If your array values were uniformly distributed, all the regions would be about the same length, and this wouldn’t provide much gain (if any); it’s not much different from an interpolation search. Since your values are so skewed, most of the 256 entries will point to very short regions (near the middle) which are fast to binary search, or even 0-sized regions. 2 or 3 entries at each end will point to much larger regions of the array, which will then take relatively longer to search through (almost as long as a binary search of the whole array). Since your inputs are uniformly distributed, the average time spent searching will be reduced, and hopefully this reduction is greater than the cost of the initial lookup. Your worst case might well end up slower, though.
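
A minimal sketch of the 256-entry table in C, assuming the array is sorted, holds distinct values, and has fewer than 256 entries so that an index fits in one byte. The `values` array is a shortened, hypothetical stand-in for the real 136-element array, and the names `range_lo`, `range_hi`, `build_table` and `contains` are illustrative, not taken from the question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample data: sorted, distinct, clustered at both ends. */
static const uint64_t values[] = {
    0, 1, 2, 3, 5, 8, 13, 21, 34, 55,
    4660046610375530309ULL, 7540113804746346429ULL,
    10420180999117162541ULL, 10420180999117162546ULL,
    10420180999117162548ULL
};
enum { NVALUES = sizeof values / sizeof values[0] };

/* For each possible top byte b, [range_lo[b], range_hi[b]) is the
 * half-open index range of the values whose top 8 bits equal b. */
static uint8_t range_lo[256];
static uint8_t range_hi[256];

/* One pass over the sorted array fills both tables. */
static void build_table(void) {
    size_t i = 0;
    for (unsigned b = 0; b < 256; b++) {
        range_lo[b] = (uint8_t)i;
        while (i < NVALUES && (values[i] >> 56) == b) i++;
        range_hi[b] = (uint8_t)i;
    }
}

/* Look up the region for n's top byte, then binary search only that region. */
static bool contains(uint64_t n) {
    size_t lo = range_lo[n >> 56];
    size_t hi = range_hi[n >> 56];
    const size_t end = hi;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (values[mid] < n) lo = mid + 1;
        else hi = mid;
    }
    return lo < end && values[lo] == n;
}
```

Call `build_table()` once at startup, then `contains(n)` for each query. The two tables take 512 bytes in this layout; whether the extra load pays for the shorter search is something only a benchmark on the target machine can settle.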

    To refine this, you might have a 2-level lookup table on 4 bits at a time. The first level either says “search between these indices”, or else “look up the next 4 significant bits in this second-level table”. The former is fine for middling values, where 16 times the value-range still corresponds to a very small index-range, and so is still quick to search. The latter would be for the ends of the range where the search space is larger. Total size of tables would be smaller, which may or may not give better performance due to better caching of less data. The tables themselves could be generated at runtime, or at compile-time if you’re willing to generate C code once the array values are known. You could even code the lookup table as a giant switch-statement from hell, just to see if it speeds things up or not.
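
One way the two-level idea might look in C, built at runtime. This is a sketch under assumptions: the `Entry` layout, the `DIRECT_MAX` threshold and the function names are hypothetical, the `values` array is a shortened stand-in for the real one, and the second-level storage is sized for the worst case for simplicity rather than allocated on demand.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample data: sorted, distinct, clustered at both ends. */
static const uint64_t values[] = {
    0, 1, 2, 3, 5, 8, 13, 21, 34, 55,
    4660046610375530309ULL, 7540113804746346429ULL,
    10420180999117162541ULL, 10420180999117162546ULL,
    10420180999117162548ULL
};
enum { NVALUES = sizeof values / sizeof values[0] };

/* Hypothetical threshold: longer ranges get a second-level table. */
enum { DIRECT_MAX = 4 };

typedef struct {
    uint8_t lo, hi; /* half-open index range [lo, hi) into values */
    int8_t sub;     /* second-level table to consult, or -1 for "search [lo, hi)" */
} Entry;

static Entry level1[16];     /* indexed by bits 63..60 of the input */
static Entry level2[16][16]; /* indexed by bits 59..56; worst-case sized */

static void build_tables(void) {
    size_t i = 0;
    int next_sub = 0;
    for (unsigned a = 0; a < 16; a++) {
        size_t start = i;
        while (i < NVALUES && (values[i] >> 60) == a) i++;
        level1[a].lo = (uint8_t)start;
        level1[a].hi = (uint8_t)i;
        level1[a].sub = -1;
        if (i - start <= DIRECT_MAX) continue; /* short range: search directly */
        level1[a].sub = (int8_t)next_sub;
        size_t j = start;
        for (unsigned b = 0; b < 16; b++) {
            level2[next_sub][b].lo = (uint8_t)j;
            while (j < i && (values[j] >> 56) == ((a << 4) | b)) j++;
            level2[next_sub][b].hi = (uint8_t)j;
            level2[next_sub][b].sub = -1;
        }
        next_sub++;
    }
}

static bool contains(uint64_t n) {
    Entry e = level1[n >> 60];
    if (e.sub >= 0) e = level2[e.sub][(n >> 56) & 0xF]; /* next 4 bits */
    size_t lo = e.lo, hi = e.hi;
    const size_t end = hi;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (values[mid] < n) lo = mid + 1;
        else hi = mid;
    }
    return lo < end && values[lo] == n;
}
```

With the sample data only the lowest first-level slot is dense enough to get a second-level table; every other slot is searched directly or is empty. A version that allocates just the second-level tables it needs would be the one to compare against the flat 256-entry table.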

    If you haven’t already, you should also benchmark an interpolation search rather than a simple binary chop once you start searching in the array.
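
For benchmarking purposes, an interpolation search could be sketched as follows. It assumes a sorted array of distinct values and estimates the probe position in `double` arithmetic so the 64-bit products cannot overflow. The sample `values` array and the name `contains_interp` are hypothetical. On data as skewed as this, the estimate can be poor over the whole array, so it is probably most interesting when applied to an already narrowed region.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample data: sorted, distinct, clustered at both ends. */
static const uint64_t values[] = {
    0, 1, 2, 3, 5, 8, 13, 21, 34, 55,
    4660046610375530309ULL, 7540113804746346429ULL,
    10420180999117162541ULL, 10420180999117162546ULL,
    10420180999117162548ULL
};
enum { NVALUES = sizeof values / sizeof values[0] };

static bool contains_interp(uint64_t n) {
    size_t lo = 0, hi = NVALUES - 1; /* inclusive bounds */
    while (lo <= hi && n >= values[lo] && n <= values[hi]) {
        if (values[hi] == values[lo]) return values[lo] == n;
        /* Guess where n would sit if values were evenly spaced in [lo, hi].
         * The ratio is in [0, 1], so pos always lands inside [lo, hi]. */
        double ratio = (double)(n - values[lo]) /
                       (double)(values[hi] - values[lo]);
        size_t pos = lo + (size_t)(ratio * (double)(hi - lo));
        if (values[pos] == n) return true;
        if (values[pos] < n) lo = pos + 1;
        else hi = pos - 1; /* pos > lo here, because values[lo] <= n */
    }
    return false;
}
```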

    Note that I’ve worked to reduce the number of comparisons made in the binary search, rather than specifically the number of branch mispredictions. The two are sort of proportional anyway – you can’t avoid that each time you halve the possibilities in a binary search, you’ll get a misprediction in something like 50% of cases. If you really wanted to minimize mispredictions, then a linear search guarantees only one misprediction per lookup (the one that breaks the loop). That ain’t faster in general, but you could experiment to see whether there’s a size for the remaining array to be searched, below which you should switch to a linear search, perhaps unrolled, perhaps fully unrolled. There may be some other much cleverer hybrid linear/binary search that can be tuned for the relative cost of a successful vs. unsuccessful comparison, but if so I don’t know it.
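
The switch-to-linear experiment could start from something like this sketch: binary chop until the remaining range is small, then scan it linearly so the only hard-to-predict branch is the one that ends the scan. `LINEAR_CUTOFF` is a hypothetical tuning knob to be set by benchmarking, and the sample `values` array and the name `contains_hybrid` are illustrative.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample data: sorted, distinct, clustered at both ends. */
static const uint64_t values[] = {
    0, 1, 2, 3, 5, 8, 13, 21, 34, 55,
    4660046610375530309ULL, 7540113804746346429ULL,
    10420180999117162541ULL, 10420180999117162546ULL,
    10420180999117162548ULL
};
enum { NVALUES = sizeof values / sizeof values[0] };

/* Hypothetical cutoff: ranges this short or shorter are scanned linearly. */
enum { LINEAR_CUTOFF = 8 };

static bool contains_hybrid(uint64_t n) {
    size_t lo = 0, hi = NVALUES; /* if n is present, its index is in [lo, hi) */
    while (hi - lo > LINEAR_CUTOFF) {
        size_t mid = lo + (hi - lo) / 2;
        if (values[mid] <= n) lo = mid;
        else hi = mid;
    }
    /* Linear scan: stop at the first value that is not below n. */
    for (size_t i = lo; i < hi; i++) {
        if (values[i] >= n) return values[i] == n;
    }
    return false;
}
```

Because the cutoff is a compile-time constant, the final loop has a known maximum trip count, so a compiler may unroll it; unrolling by hand is the other option mentioned above.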
