Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 3799256
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 19, 20262026-05-19T13:42:14+00:00 2026-05-19T13:42:14+00:00

As a mostly front-end developer, this is in the realm of computer science that

  • 0

As a mostly front-end developer, this is in the realm of computer science that I don’t often delve into, but here’s my scenario:

I’ve got an input of a string, split on spaces, say "pinto beans"

I’ve got a array of results to search, that contains results like:
["beans, mung","beans, pinto","beans, yellow","beans, fava"]

what might be the quickest way (preferably in javascript or php) to find the most “relevant” results, aka most matches, for instance, in the above case, I would like to sort the return array so that "beans, pinto" is put at the top, and the rest come below, and any other results would go below those.

My first attempt at this would be to do something like matching each result item against each input item, and incrementing matches on each one, then sorting by most matches to least.

This approach would require me to iterate through the entire result array a ton of times though, and I feel that my lack of CS knowledge is leaving me without the best solution here.

/* EDIT: Here’s how I ended up dealing with the problem: */

Based on crazedfred’s suggestion and the blog post he mentioned (which was VERY helpful), I wrote some php that basically uses a combination of the trie method and the boyer-moore method, except searching from the beginning of the string (as I don’t want to match “bean” in “superbean”).

I chose php for the ranking based on the fact that I’m using js libraries, and getting real benchmarks while using convenience functions and library overhead wouldn’t produce the testable results I’m after, and I can’t guarantee that it won’t explode in one browser or another.

Here’s the test data:

Search String: lima beans

Result array (from db): ["Beans, kidney","Beans, lima","Beans, navy","Beans, pinto","Beans, shellie","Beans, snap","Beans, mung","Beans, fava","Beans, adzuki","Beans, baked","Beans, black","Beans, black turtle soup","Beans, cranberry (roman)","Beans, french","Beans, great northern","Beans, pink","Beans, small white","Beans, yellow","Beans, white","Beans, chili","Beans, liquid from stewed kidney beans","Stew, pinto bean and hominy"]

First, I drop both the search string and the result array into php variables, after explode()ing the string on spaces.

then, I precompile my patterns to compare the results to:

$max = max(array_map('strlen',$input));
$reg = array();
for($m = 0; $m < $max; $m++) {
    $reg[$m] = "";
    for($ia = 0; $ia < count($input); $ia++) {
        $reg[$m]. = $input[$ia][$m];
    }
}

this gives me something like : ["lb","ie","ma","an","s"]

then, I basically take each result string (split on spaces), and match a case insensitive character class with the corresponding character number to it. If at any point during that comparison process I don’t get any matches, I skip the word. This means if only 1 result starts with “b” or “l”, I’ll only run one comparison per WORD, which is really fast. Basically I’m taking the part of trie that compiles the searches together, and the constant speedup of the Boyer-Moore stuff.

Here’s the php – I tried whiles, but got SIGNIFICANTLY better results with foreaches:

$sort = array();
foreach($results as $result) {
    $matches = 0;
    $resultStrs = explode(' ', $result);
    foreach($resultStrs as $r) {
        $strlen = strlen($r);
        for($p = 0; $p < $strlen; $p++) {
            if($reg[$p])
                preg_match('/^['.$reg[$p].']/i',$r[$p],$match);
            if($match==true) {
                $matches++;
            } else {
                break 2;
            }
        }
    }
    $sort[$result] = $matches;
}

That outputs an array with the results on the keys, and how many character matches we got in total on the values.

The reason I put it that way is to avoid key collisions that would ruin my data, and more importantly, so I can do a quick asort and get my results in order.

That order is in reverse, and on the keys, so after the above code block, I run:

asort($sort);
$sort = array_reverse(array_keys($sort));

That gives me a properly indexed array of results, sorted most to least relevant. I can now just drop that in my autocomplete box.

Because speed is the whole point of this experiment, here’s my results – obviously, they depend partially on my computer.

2 input words, 40 results: ~5ms
2 input words, (one single character, one whole) 126 results: ~9ms

Obviously there’s too many variables at stake for these results to mean much to YOU, but as an example, I think it’s pretty impressive.

If anyone sees something wrong with the above example, or can think of a better way than that, I’d love to hear about it. The only thing I can think of maybe causing problems right now, is if I were to search for the term lean bimas, I would get the same result score as lima beans, because the pattern isn’t conditional based on the previous matches. Because the results I’m looking for and the input strings I’m expecting shouldn’t make this happen very often, I’ve decided to leave it how it is for now, to avoid adding any overhead to this quick little script. However, if I end up feeling like my results are being skewed by it, I’ll come back here and post about how I sorted that part.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-19T13:42:15+00:00Added an answer on May 19, 2026 at 1:42 pm

    Since you specifically noted that it could be in several languages, I’ll leave my answer in pseudocode so you can adapt to the language of your choice.

    Since you are matching an array-to-array, performance is going to vary a lot based on your implementation, so trying several ways and considering exactly when/how/how-often this will be used in a good idea.

    The simple way is to leave the data as-is and run an O(n^2) search:

    for (every element in array X)
        for (every element in array Y)
            if (current X == current Y)
                add current X to results
    
    return results
    

    If you sorted the arrays first (a sorting algorithm such a squicksort is implemented for you in many languages, check your documentation!), then the actual matching is quicker. Use whatever string-comparison your language has:

    Sort array X
    Sort array Y
    
    Let A = first element of X
    Let B = first element of Y
    
    while (A and B are in array)
        if (A > B)
            Next B
        else if (A < B)
            Next A
        else  //match!
            Add A to results
            Next A
            Next B
    
    //handle case where one array is larger (trivial loop)
    
    return results
    

    Now the important part to the above solution is if the sorting of the arrays saved time versus just an ordinary O(n^2) sort. Usually, moving elements around in arrays is fast whereas string comparisons are not, so it may be worth it. Again, try both.

    Finally, there’s this crazy algorithm the Mailinator guy dreamed up for doing huge string comparisons in constant time using some awesome data structures. Never tried it myself, but it must work since he runs his whole site on very low-end hardware. He blogged about it here if you’re interested. (Note: the blog post is about filtering spam, so some words in the post may be slightly NSFW.)

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I'm a new developer at my company and I do mostly front-end web development.
I mostly use lambda functions but sometimes use nested functions that seem to provide
I've managed to mostly ignore all this multi-byte character stuff, but now I need
I mostly use Java and generics are relatively new. I keep reading that Java
I've got a (mostly) working plugin developed, but since its function is directly related
I do mostly Java and C/C++ development, but I'm starting to do more web
C++ is mostly a superset of C, but not always. In particular, while enumeration
I'm mostly an oracle novice, so forgive me if this is a stupid question...
I'm quite confused by this one. I've build a read more button using mostly
Mostly for my amusement, I created a makefile in my $HOME/bin directory called rebuild.mk

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.