My system stores products from many different e-shops and I need to pair products according to their names. For example:
INPUT: MySQL table products
id | name | id_seller
1 | porsche 911 red edition | 1
2 | red porsche 911 gt | 2
3 | icecream | 1
DESIRED OUTPUT: Suggestion that product 1 is similar to product 2.
As a first step, it would be sufficient to base suggestions only on the number of common words – 3 out of 4 in this Porsche example.
A more sophisticated solution would compare the order of the words, not just their occurrence, but I guess that wouldn't be trivial.
Can this be done using just a MySQL query and its built-in functions, or does a more sophisticated library/add-on have to be used?
Here is a SQLFiddle example that finds pairs of products with at least one common word in the name column.
If you need to find rows with at least N common words, you should create a tmp table that splits each row into words. Here is an example and a stored procedure to do it. For your example this table looks like:
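A minimal sketch of that word table, assuming the hypothetical name product_words with columns id and word. It does not reproduce the stored procedure mentioned above; instead it swaps in the common SUBSTRING_INDEX idiom with a small inline list of numbers, and it assumes words are separated by single spaces and that no name has more than 5 words.

```sql
-- Source table from the question.
CREATE TABLE products (
  id        INT PRIMARY KEY,
  name      VARCHAR(255) NOT NULL,
  id_seller INT NOT NULL
);

INSERT INTO products (id, name, id_seller) VALUES
  (1, 'porsche 911 red edition', 1),
  (2, 'red porsche 911 gt', 2),
  (3, 'icecream', 1);

-- Hypothetical word table: one row per (product id, word).
-- A regular table is used in this sketch because MySQL cannot refer to
-- a TEMPORARY table more than once in the same query, and comparing
-- products with each other needs a self-join.
CREATE TABLE product_words (
  id   INT NOT NULL,
  word VARCHAR(100) NOT NULL,
  PRIMARY KEY (id, word)
);

-- Split each name on single spaces; n.n selects the n-th word.
-- The inline number list caps names at 5 words (assumption of this sketch).
INSERT INTO product_words (id, word)
SELECT DISTINCT
       p.id,
       SUBSTRING_INDEX(SUBSTRING_INDEX(p.name, ' ', n.n), ' ', -1) AS word
FROM products AS p
JOIN (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3
      UNION ALL SELECT 4 UNION ALL SELECT 5) AS n
  ON n.n <= 1 + LENGTH(p.name) - LENGTH(REPLACE(p.name, ' ', ''));
```

For the example data this gives the words porsche, 911, red and edition for id 1; red, porsche, 911 and gt for id 2; and icecream for id 3. Names with repeated spaces would produce empty words, so they would need to be normalized first.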
In this case you can use the following query to find ids with at least N common words (here N = 3):
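A minimal sketch of such a query, assuming the word table is a regular (non-TEMPORARY) table with the hypothetical name product_words and columns id and word, holding each word at most once per product. It is an illustration of the approach, not necessarily the original query.

```sql
-- Assumed schema: product_words(id INT, word VARCHAR), unique per (id, word).
-- Join the word table to itself on equal words, then count the matches
-- per pair of products. w1.id < w2.id keeps each pair once and skips
-- comparing a product with itself.
SELECT w1.id    AS id_a,
       w2.id    AS id_b,
       COUNT(*) AS common_words
FROM product_words AS w1
JOIN product_words AS w2
  ON w1.word = w2.word
 AND w1.id < w2.id
GROUP BY w1.id, w2.id
HAVING COUNT(*) >= 3;   -- N = 3
```

For the three products in the question this should return the single pair (1, 2) with 3 common words. Lowering the threshold in the HAVING clause to 1 lists every pair that shares at least one word.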