I’m currently working on an iOS app that will have full-text search. The search runs a SELECT statement against a SQLite database, but many of the values in the database contain Scandinavian letters (Æ, Ö, Á, etc.), and I’m having trouble building the statement without those letters turning into hex values.
Here is what I’m currently doing:
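// Adjacent string literals are concatenated; each one ends with a space so the SQL keywords stay separated.
const char *sql = [[NSString stringWithFormat:
    @"SELECT %@ "
    @"FROM Customer c "
    @"JOIN Customer_Metadata cm ON c.CustomerId = cm.CustomerId "
    @"WHERE cm.Name LIKE '%%%@%%' "
    @"ORDER BY cm.Name", kCustomerSelect, searchString] UTF8String];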
Now kCustomerSelect is a constant containing the columns I want to select, and searchString contains the user input.
This works like a charm for normal Latin letters, but if the search string contains a letter such as Ö (for instance stö), it is logged as st\xc3\xb6. I am aware that logging a UTF-8 encoded C string will not show the characters themselves, but the real problem is that my SELECT statement isn’t returning any results.
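For reference, the escaped output is just the UTF-8 encoding of the letter, so the encoding step itself looks correct. Here is a minimal sketch in plain C that reproduces the escaping; escape_non_ascii is a hypothetical helper written for illustration, not part of any SDK:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns a copy of `s` in which every byte outside
   ASCII is written as \xNN. It uses a static buffer, so it is not
   thread-safe and long inputs are truncated. */
const char *escape_non_ascii(const char *s) {
    static char out[256];
    size_t used = 0;
    out[0] = '\0';
    for (const unsigned char *p = (const unsigned char *)s; *p != 0; ++p) {
        size_t need = (*p < 0x80) ? 1 : 4;      /* one char, or "\xNN" */
        if (used + need >= sizeof out) break;   /* keep room for the terminator */
        if (*p < 0x80) {
            out[used++] = (char)*p;
            out[used] = '\0';
        } else {
            used += (size_t)snprintf(out + used, sizeof out - used,
                                     "\\x%02x", (unsigned)*p);
        }
    }
    return out;
}
```

Passing the UTF-8 bytes of stö through this helper gives st\xc3\xb6, which matches what I see in the log, so the missing results are unlikely to be caused by UTF8String.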
I’m calling sqlite3_open() before executing the query, and according to the SQLite documentation: *“The default encoding for the database will be UTF-8 if sqlite3_open() or sqlite3_open_v2() is called and UTF-16 in the native byte order if sqlite3_open16() is used.”*
I’ve tried replacing UTF8String with cStringUsingEncoding: and going through different encodings. None of them worked (not that I was expecting them to, but I at least wanted to try).
Any and all help, or tips, would be appreciated.
Edit
I’ve now tried using SQLite Database Browser to run the same select statement on the database and am not getting any results.
This is leading me to believe that this might have something to do with me using FTS3 to create my Customer_Metadata table.
MrDresden
If you’re using FTS3, the default tokenizer isn’t going to do what you want.
— http://www.sqlite.org/fts3.html#tokenizer
You’ll need to use a custom tokenizer, or see if the icu or unicode61 tokenizers will work for you. Information about both is in the doc linked above.