I have my search term:
"Yellow large widgets"
I split the terms into its 3 words:
1 = "Yellow";
2 = "Large";
2 = "Widgets";
I then search with:
SELECT * FROM widgets
WHERE (description LIKE '%yellow%' OR description LIKE '%large%' OR description LIKE 'widgets')
OR (title LIKE '%yellow%' OR title LIKE '%large%' OR title LIKE '%widgets%')
How can I sort the results with these biases?
- Title takes presidence, if any of the terms appear in title they should be considered more important
- Occurrence counts, results with higher total occurances should appear first
Ideal Methodoloy
- Count occurrences in
description. - Each occurrence here is worth
1 point. - Count occurrences in
title. - Each
titleoccurrence is worth5 points. - Sort by points.
But I wouldn’t know where to start doing that in SQL.
Okay, let’s have your search terms in a temp table:
And let’s do something silly:
We’ve now counted each occurrence of each term, in both the description and the title.
So now we can sum and weight those occurrences:
And if we need to do more with this, I’d recommend putting the above in a subselect
(Note that I’ve assumed there’s an ID column in all these examples, but that can be expanded into as many columns as are necessary to define the PK in the widgets table)
The real trick here is counting occurrences of a word within a larger body of text, which is done by: