I’m implementing an autocomplete that lets a user enter partial text, which is then matched against four different columns in a table. Here is a basic example:
+------------+-----------+--------+-----------------+
| first_name | last_name | login  | email           |
+------------+-----------+--------+-----------------+
| John       | Smith     | jsmith | jsmith@foo.bar  |
| Johnny     | Ringo     | ringo  | ringer@hmm.okay |
| Bob        | Jones     | bjones | j1234@xyz.abc   |
| Jane       | Doe       | doej   | doedoe@blah.umm |
+------------+-----------+--------+-----------------+
When the user enters “jo”, I want to match the records from this table where at least one of those four columns matches the pattern “jo%”, ignoring case. For this example, the first three would match: the first two on their first_name values, and the third on its last_name value (“Jones”). If the search were “js”, then only the first record would match, on its login and email values. And so on. I’d also like to return the results ordered by similarity, so that the first result is the “closest match” and the rest follow in decreasing order of closeness (standard autocomplete behavior).
I’ve been trying to solve this problem using UTL_MATCH, with code that produces the following query:
SELECT first_name,
       last_name,
       login,
       email,
       (  utl_match.jaro_winkler_similarity(first_name, 'js')
        + utl_match.jaro_winkler_similarity(last_name, 'js')
        + utl_match.jaro_winkler_similarity(login, 'js')
        + utl_match.jaro_winkler_similarity(email, 'js')) similarity
  FROM users
 WHERE LOWER(first_name) LIKE LOWER('js%')
    OR LOWER(last_name) LIKE LOWER('js%')
    OR LOWER(login) LIKE LOWER('js%')
    OR LOWER(email) LIKE LOWER('js%')
 ORDER BY similarity DESC
The results aren’t as accurate as I’d like them to be, and I’ve seen autocompletes in the wild that work the way I’d like mine to work, but have no idea how they’re implemented on the back-end.
Can anyone point me in the right direction?
Searching is always fun. You’ve started with a very good basic approach. What I usually do is create a secondary table that is loaded via trigger. The trigger stores the primary key from your users table in the first column, and the second column is the “search” column. In your example, every users row would yield four rows in the search table, one per searchable column. Make sure you store the search values in upper case (and upper-case the user’s input as well), so that you can index the search column; since you are sticking with your LIKE 'prefix%' syntax, Oracle can use that index. It does require an extra table, but the table can be maintained entirely by triggers.
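To make the search-table idea concrete, here is a minimal sketch of it. It uses SQLite through Python's sqlite3 module purely so that it runs on its own; it is not Oracle code, and an Oracle version would need its own trigger syntax. The names user_search, term, users_after_insert, build_demo_db and search are all made up for illustration, and the ranking rule (shortest matching value first) is my assumption about what "closest match" could mean, not something taken from your question.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    first_name TEXT, last_name TEXT, login TEXT, email TEXT
);
-- Hypothetical search table: one row per (user, searchable value).
CREATE TABLE user_search (
    user_id INTEGER REFERENCES users(id),
    term    TEXT    -- stored upper-cased, as suggested above
);
CREATE INDEX user_search_term_idx ON user_search (term);

-- Load the search table whenever a user is inserted.
-- (UPDATE and DELETE on users would need analogous triggers.)
CREATE TRIGGER users_after_insert AFTER INSERT ON users
BEGIN
    INSERT INTO user_search VALUES (NEW.id, UPPER(NEW.first_name));
    INSERT INTO user_search VALUES (NEW.id, UPPER(NEW.last_name));
    INSERT INTO user_search VALUES (NEW.id, UPPER(NEW.login));
    INSERT INTO user_search VALUES (NEW.id, UPPER(NEW.email));
END;
"""

SAMPLE_USERS = [
    ("John", "Smith", "jsmith", "jsmith@foo.bar"),
    ("Johnny", "Ringo", "ringo", "ringer@hmm.okay"),
    ("Bob", "Jones", "bjones", "j1234@xyz.abc"),
    ("Jane", "Doe", "doej", "doedoe@blah.umm"),
]


def build_demo_db():
    """Create an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO users (first_name, last_name, login, email) "
        "VALUES (?, ?, ?, ?)",
        SAMPLE_USERS,
    )
    return conn


def search(conn, text, limit=10):
    """Return users having at least one search value that starts with text."""
    # Escape LIKE wildcards so a typed '%' or '_' is matched literally.
    escaped = (
        text.upper()
        .replace("\\", "\\\\")
        .replace("%", "\\%")
        .replace("_", "\\_")
    )
    sql = """
        SELECT u.first_name, u.last_name, u.login, u.email
          FROM users u
          JOIN user_search s ON s.user_id = u.id
         WHERE s.term LIKE ? ESCAPE '\\'
         GROUP BY u.id
         -- Assumed ranking: the shorter the matching value, the closer
         -- it is to what was typed; login breaks ties.
         ORDER BY MIN(LENGTH(s.term)), u.login
         LIMIT ?
    """
    return conn.execute(sql, (escaped + "%", limit)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in search(conn, "jo"):
        print(row)
```

With the sample data, searching for "jo" returns John Smith, then Bob Jones (matched on last_name), then Johnny Ringo, and searching for "js" returns only John Smith. One difference to keep in mind: SQLite's LIKE ignores ASCII case by default while Oracle's does not, which is why upper-casing both the stored value and the input matters on the Oracle side.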