In a project from almost a decade ago, we encoded diacritical marks in a MySQL database as HTML entities. All of that seems bizarre today, but be that as it may. The application that uses this database implements a search function, and my problem is that the search does not work correctly when the search string contains a diacritical mark, as in "für".
The simplified MySQL query looks like this:
SELECT kunstwerk.*, kategorie.published, kategorie.bezeichnung
FROM kunstwerk
LEFT JOIN kategorie ON SUBSTRING( kategorie, 1, 7 ) = kategorie_Nr
WHERE published = 'true'
AND published_veto <> 'false'
AND MATCH (titel_DE)
AGAINST (
'+f&uuml;r '
IN BOOLEAN MODE
)
ORDER BY kategorie
My problem is that it matches anything that contains "&uuml;", regardless of the surrounding characters in "f&uuml;r".
What is the reason for that?
I’m not 100% sure what happens here, but MySQL is clearly choking on the & and ; control characters (my suspicion is that it’s ultimately searching only for the uuml part). Anyway, wrapping the term in quotes will help.
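A minimal sketch of what that could look like, reusing the query from the question. It assumes the titles are stored with the entity form f&uuml;r, and that "quotes" means double quotes placed inside the single-quoted search string, which is the phrase syntax of boolean mode. In a phrase search MySQL still splits the phrase into words, but they must appear in the same order, so uuml on its own should no longer be enough for a match. I have not run this against the original data, and the result may also depend on full-text settings such as the minimum word length.

```sql
SELECT kunstwerk.*, kategorie.published, kategorie.bezeichnung
FROM kunstwerk
LEFT JOIN kategorie ON SUBSTRING(kategorie, 1, 7) = kategorie_Nr
WHERE published = 'true'
AND published_veto <> 'false'
AND MATCH (titel_DE)
-- Assumption: the double quotes turn the term into a phrase, so the pieces
-- around & and ; have to occur together and in this order.
AGAINST ('+"f&uuml;r"' IN BOOLEAN MODE)
ORDER BY kategorie
```

If the search term comes from user input, the double quotes would need to be added by the application before the term is passed to the query.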