I’ve looked all over the web only to find ways to make searches insensitive to diacritic characters, but I need the opposite.
In my situation I need to be able to compare specifically symbols with superscript and subscript dots (i.e. ȧ & ạ) and some other more common acents (á, ã, etc), but these letters might be anything (ṡ, ṛ, ṫ, ḍ, ṅ, etc). The desired result would work like this: if I search for “a” I receive “a” only, and if I search for “ȧ”, I receive “ȧ” only as a result and not “a” along with it (without the dot).
I’ve read that I need to use utf8_bin and have tried changing both my field collations, table collations and database collations to that with no success. Here’s the code:
// "sound" is being passed in by an AJAX call
$sound = $_POST['sound'];
$query = "SELECT * FROM sounds WHERE 'sound' = '$sound'";
$result = mysql_query($query);
// This is then sent back to my page.
I’ve also looked into COLLATE with little success. I’m probably misunderstanding its prober usage:
// Attempting to covert the searched string into the utf8_bin format to match my db collations
$query = "SELECT * FROM sounds WHERE 'sound' = '$sound' COLLATE utf8_bin";
When I use utf8_general_ci or utf8_unicode_ci I get the excepted result of “a” or “ȧ” returning both “ȧ” and “a”. However, If I use utf8_bin I get nothing when searching for either of these. I believe this is because in my database when using utf8_bin this – “ṅ(PH)” (one of my entries) – gets converted to this – “e1b98528504829”. So Is there a way of converting my searches to that same format before querying them? Or just an all around better way of making this work?
Thank you!
Okay, with a little help from a friend I got it working. Turns out it works just fine with utf8_general_ci collations as well.
My first issue was with how I had entered my data into my database. I had used phpMyAdmin to do this, which for some reason didn’t encode the data properly and all my bins were turning out wrong. This was fixed by just writing my own sql to enter the values.
Second, I ended up using the PHP function iconv() to encode the data coming from the web page. These two solutions put together got matching values and the entire script works great.
Thanks all for the help and suggestions… really appreciated, and believe me, did not go to waste. I spent a long time fiddling with all of them.
Cheers!