Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 3695368
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 19, 20262026-05-19T04:36:34+00:00 2026-05-19T04:36:34+00:00

I have an array of hex numbers, and I need to go over other

  • 0

I have an array of hex numbers, and I need to go over other numbers and check if they appear in that array. Right now i’m using a foreach loop that goes over the entire array each time. Is there a way to make it faster by sorting the array at first, and then implementing binary search on it.

The code at the moment:

sub is_bad_str{
  my ($str, @keys) = @_;
  my $flag = 0;
  my ($key, $hex_num);
        if ($str =~ m/14'h([0-9a-f][0-9a-f][0-9a-f][0-9a-f])/;){ #'# fixes bad highlighting
  $hex_num = $1;
      }
  if (defined $hex_num){
    foreach $key (@keys){
        if ($hex_num =~ /\Q$key\E/i){
            $flag = 1;
            last;
        }
    }
  }
  if (($flag == 0) && (defined $hex_num)){
    return 1;#Bad str
  }else{
    return 0;#Good str
      }
}
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-19T04:36:34+00:00Added an answer on May 19, 2026 at 4:36 am

    There are four strategies to doing efficient bulk search in a set of data in Perl.

    The full analysis is outlined below, but in summary, the best performance on average random data set with a significant # of searches is of course offered by hash lookups, followed by much-worse BST.


    1. Binary (half-interval) search of an array.

      This is obviously a standard algorithmic approach.

      Performance Costs:

      • O(N * log N) for initial sorting.
      • O(N) on average for inserting/removing data in the list once sorted. Perl arrays are NOT linked lists so it’s not an O(log N).
      • O(log N) for each search.

      Implementation: the algorithm is so simple that DIY is easy. As usual, CPAN modules exist and should probably be used instead of DIY anyway: Search::Binary .


    2. Binary Search Trees (BSTs)

      Performance Costs:

      • O(N * log N) for initial sorting.
      • O(log N) on average for inserting/removing data in the list once sorted
      • O(log N) for each search.

      Implementation: several flavors exist on CPAN: Tree::Binary::Search, Tree::Treap, Tree::RedBlack. The latter two have better average performance and smaller performance fluctuations, algorithmically.

      Comparison: If the data WILL change, you must use BST to avoid re-sort costs. If your data is random and never changes once sorted, you can use simple binary search over BST but BSTs can be tuned better if every last ounce of performance matter (BST can be optimized for faster average searching than list binary search if you know your lookup costs based on data distribution – see Wiki’s “Optimal binary search trees” section or if your data distribution favors one of the special trees like Treap or Red/Black).


    3. Abbreviated (short circuited) scan lookups.

      These are linear scan searches on un-sorted list which STOP searching once the item is found.

      Performance: O(N) per search for random data, but a faster O(N) (say, N/2) than a full-list search like grep. No extra costs.

      Implementation: There are 3 ways to do them in Perl:

      • Smart match operator (~~). The problem is that it is ONLY available in Perl 5.10 and above.
      • Your own loop that does next; once found.
      • List::MoreUtils module’s first() subroutine.

      Comparison:

      • First, between the 3 implementations above, List::MoreUtils::first is faster than the DIY loop because it’s implemented in XS; so it should be used in Perl versions before 5.10. Smart match is probably just as fast, though I would benchmark the two before you pick one or the other in Perl 5.10+.

      • Second, comparing short circuited search to other methods, there are only 3 edge cases where it should be used:

        A. Memory constraints. Both sorted list search, BST and hash lookups have memory footprint of at the very least 2*N. If you face a memory constraint (given your list size) severe enough that N vs 2*N memory becomes a non-negotiable cost barrier, then you use short circuited search and pay the performance penalty in time.
        This is especially true when you’re processing a large data set in batches/line-by-line, so as to avoid storing the whole thing in memory in the first place.

        B. If your data is distributed and pre-sorted in such a way that a VAST majority of the searches would find their quarry in the very beginning of the list. If that’s the case, it MAY outperform the fancier methods like BST of binary search despite their obviously faster O(log N) average searches. It’d still be hard to outperform hash lookups, but more on that later.

        C. Short circuited search is superior to BSTs or sorted list searches if the # of searches performed is fairly small compared to the list size, in which case the initial sorting cost of the first 2 methods (O(N log N)) would outweigh the search savings. Since the savings of BST vs. linear search are O(M * N) where M is # of searches, it follows that # of searches M must be less than O(log N) to realize the savings on average, but may be more in the second edge case where average scan cost is less than O(N) due to data distribution.


    4. Hash lookups

      Performance Costs:

      • O(N) + epsilon for initial hash creation (It’s not strictly speaking O(N) for a random large data set, due to possible key collision. I don’t know enough about Perl’s hash implementation to clarify this other than state that it CAN be come a concern for any hashmaps.
      • O(1) on average for inserting/removing data in the list once sorted (+ same epsilon as initial hash creation due to key collisions).
      • O(1) for each search (plus same epsilon).

      Implementation:

      my %lookup = map { $_ => 1 } @list; 
      my @lookup2{ @list } = (); # Alternative method of creating a hash
      sub find { return exists $lookup{$_[0]; }
      

      Comparison:

      • First, the same logic applies to comparing short circuited search with hash lookups as with BST vs short-circuited search. E.g., you should ALMOST always use hashmaps over linear search, with the exception of the same two edge cases (data set is such that average list scan becomes O(1) instead of O(N) and the ratio of # of searches to the data set size makes the aggregate cost of searches less than O(N) needed for hash creation).

      • Second, hashmaps ON AVERAGE are obviously much faster than BSTs or binary list search. The only possible edge case here is that you somehow stumble into a data set that manages to overload the buckets and turn that extra “epsilon” cost into a large enough weight that it starts under-performing O(log N).

        I strongly doubt that it is even remotely possible, but again, don’t know enough about Perl’s implementation of hashmaps to prove that it would never happen for even the worst data set.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a byte array filled with hex numbers and printing it the easy
I have an array of hex codes that translate into assembly instructions and I
I have an array of string-encoded hex values. I need to convert those strings
I have a byte array that has hex values and I initially put those
I have test array of hex(radix 16) numbers var numbers = [01,02,a1]; and I
I have an array of hex characters that is > 8000 characters, and I
I have an array of RGB hex colors. I would like to find a
So to start, I have an array of XML files. These files need to
I have to convert a normal string array with the value {STX,ETX} into hex
I have a array of 4 chars and i need his ascii value in

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.