I’d like to use PHP to search an array (or better yet, a column of a MySQL table) for a particular string. My goal is for it to return the string it finds along with the number of matching characters (in the right order), or some other measure of how reasonable the search results are. I can then use that info to decide whether to display the top result by default or give the user a choice of the top few. I know I can do something like
$citysearch = mysql_query("SELECT city FROM $table WHERE city LIKE '$city'");
but I can’t figure out a way to determine how accurate it is.
The goal would be:
a) find ‘Milwaukee’ if the search term were ‘milwakee’ or something similar.
b) if the search term were ‘west’, return things like ‘West Bend’ and ‘Westmont’.
Anyone know a good way to do this?
More searching led me to the Levenshtein distance and then to similar_text, which proved to be the best way to do this.
similar_text compares two strings, returns the number of matching characters, and can save the similarity as a percentage in a variable passed by reference. The Levenshtein distance is the number of single-character insert, replace, or delete operations needed to get from one string to the other, with an allowance for weighting each operation differently (e.g. you can make it cost more to replace a character than to delete one). It’s apparently faster but less accurate than similar_text. Other posts I’ve read elsewhere have mentioned that for strings of fewer than 10000 characters, there’s no functional difference in speed.
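To make the two functions concrete, here is a small sketch comparing the misspelled search term from the question against the real city name. The variable names are only for illustration, and both strings are lowercased first because both functions are case-sensitive.

```php
<?php
// Lowercase both sides so the comparison ignores case.
$search = strtolower('milwakee');
$city   = strtolower('Milwaukee');

// similar_text() returns the number of matching characters and writes
// a similarity percentage into the third argument (passed by reference).
$matching = similar_text($search, $city, $percent);

// levenshtein() returns the number of single-character edits needed
// to turn the first string into the second.
$distance = levenshtein($search, $city);

// Optional weights, in order: insertion, replacement, deletion cost.
// Here a replacement costs twice as much as an insert or a delete.
$weighted = levenshtein($search, $city, 1, 2, 1);

printf("matching=%d percent=%.1f distance=%d weighted=%d\n",
    $matching, $percent, $distance, $weighted);
```

For this pair, similar_text should report 8 matching characters (about 94%) and levenshtein a distance of 1, since only the "u" is missing.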
I ended up using a modified version of something I found to make it work. This ends up saving the top 3 results (except in the case of an exact match).
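A minimal sketch of that idea, not the exact code I used: it assumes PHP 7 or later and that the city names have already been loaded into an array (for example from the city column). The function name topCityMatches and the sample city list are made up for illustration.

```php
<?php
/**
 * Hypothetical helper: rank city names by similarity to a search term.
 * Returns an array of city => similarity percent, best match first.
 * A case-insensitive exact match is returned on its own.
 */
function topCityMatches(array $cities, string $search, int $limit = 3): array
{
    $needle = strtolower(trim($search));
    $scores = [];

    foreach ($cities as $city) {
        $candidate = strtolower($city);

        // Exact match: skip the ranking and return just this city.
        if ($candidate === $needle) {
            return [$city => 100.0];
        }

        // similar_text() stores the similarity percentage in $percent.
        similar_text($needle, $candidate, $percent);
        $scores[$city] = $percent;
    }

    // Sort by score, highest first, keeping the city names as keys.
    arsort($scores);

    return array_slice($scores, 0, $limit, true);
}

// Sample data, assumed for the example.
$cities  = ['Milwaukee', 'West Bend', 'Westmont', 'Madison', 'Chicago'];
$matches = topCityMatches($cities, 'milwakee');

foreach ($matches as $city => $percent) {
    printf("%s (%.1f%%)\n", $city, $percent);
}
```

With the returned percentages you can pick a cutoff (say, show the top result by default if it scores well above the others, otherwise offer all three). The cutoff itself is a judgment call and would need tuning against real data.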