I have an application that stores data in a database. I need search functionality to work on this database.
For this to work I need a “relevance” score: a value calculated from a set of criteria that can then be used to order the results.
Say, for instance, the user enters three keywords: X, Y and Z. I need to generate a score for each database entry, and I want the criteria to be based on how many times each keyword appears.
Example:
Database Entry A: X appears 8 times, Y appears once and Z appears once, giving a collective score of 10.
Database Entry B: X appears 24 times, Y does not appear and Z does not appear, giving a collective score of 24.
Here’s my problem. Database Entry A IS more relevant to a search for X, Y and Z because it contains all three keywords, not just one, yet a standard calculation would class Database Entry B as more relevant.
I need a way to give each result a numeric score based not just on how many times each keyword appears, but also on how many different keywords it contains, with the score rising exponentially as more keywords match (i.e. entering 10 keywords would show results where all 10 appear above ones with a large number of occurrences of just one).
I need to achieve this with PHP, which will be retrieving my database results and feeding them back to my web page.
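You could compute two relevance scores: one that rates an entry by how many of the keywords it matched, and then your regular “how many matches were found” count. From your examples, that would give Entry A a keyword score of 3 and a match score of 10, and Entry B a keyword score of 1 and a match score of 24.
Then have your query order by the keyword score first and the match score second,
so that entries matching more keywords get sorted first.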
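If it is easier to do the ranking in PHP after fetching the rows, the two-score idea could be sketched roughly as below. The function names `relevance` and `rankEntries`, the array keys `distinct` and `total`, and the sample entries are all made up for illustration; it also assumes each entry is available as a single string, that keywords are non-empty, and that plain case-insensitive substring counting is good enough (it does not match whole words only).

```php
<?php
// Hypothetical helper: for one entry, count how many distinct keywords
// matched and how many matches there were in total.
function relevance(string $text, array $keywords): array
{
    $haystack = strtolower($text);
    $distinct = 0;
    $total = 0;
    foreach ($keywords as $keyword) {
        // substr_count counts substring occurrences, not whole words.
        $count = substr_count($haystack, strtolower($keyword));
        if ($count > 0) {
            $distinct++;
        }
        $total += $count;
    }
    return ['distinct' => $distinct, 'total' => $total];
}

// Hypothetical helper: score every entry, then sort by distinct keywords
// first and total matches second, both descending.
function rankEntries(array $entries, array $keywords): array
{
    $scored = [];
    foreach ($entries as $id => $text) {
        $scored[$id] = relevance($text, $keywords);
    }
    uasort($scored, function (array $a, array $b): int {
        return [$b['distinct'], $b['total']] <=> [$a['distinct'], $a['total']];
    });
    return $scored;
}

// Sample data mirroring the two entries in the question.
$entries = [
    'A' => str_repeat('x ', 8) . 'y z',
    'B' => str_repeat('x ', 24),
];
$ranked = rankEntries($entries, ['x', 'y', 'z']);
print_r(array_keys($ranked));
```

With the sample data, Entry A (3 keywords, 10 matches) sorts ahead of Entry B (1 keyword, 24 matches). Doing the same ordering in the query itself, with both scores in the ORDER BY clause, avoids pulling every row into PHP.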