I have a SQLite database with a word list. One of its tables includes the word “você”, which has this representation in Unicode: “voc\U00ea”.
I’ve found out that the same word can have the following representation with the same visual output:
"voc\U00ea",
"voce\U0302"
When I query my db using the second representation, it returns nothing. Does anyone know a way to make the query work with both representations without duplicating the records in the table?
Thanks,
Miguel
These two forms are known as NFC (Normalization Form C, the composed form) and NFD (Normalization Form D, the decomposed form). The character
\U0302 is known as a “combining circumflex”, which modifies the preceding letter. To cope with this situation, do the following:
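Normalize every string to a single form, both before storing it in the database and before using it in a query, with precomposedStringWithCanonicalMapping or precomposedStringWithCompatibilityMapping. To understand the difference between canonical and compatibility mappings, see this description.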
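
As a rough illustration of the idea (not the NSString API named above), here is a minimal Python sketch that normalizes to NFC on both insert and lookup, using the standard `unicodedata` and `sqlite3` modules. The table name `words`, the column `word`, and the helpers `nfc` and `lookup` are made up for this example; an in-memory database stands in for the real one.

```python
import sqlite3
import unicodedata


def nfc(text: str) -> str:
    """Return text in Normalization Form C (precomposed characters)."""
    return unicodedata.normalize("NFC", text)


# Hypothetical schema: a single table holding the word list.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT PRIMARY KEY)")

# Normalize before storing, so the table only ever contains NFC text.
conn.execute("INSERT INTO words (word) VALUES (?)", (nfc("voc\u00ea"),))


def lookup(connection: sqlite3.Connection, text: str):
    """Normalize the search term the same way, then query."""
    row = connection.execute(
        "SELECT word FROM words WHERE word = ?", (nfc(text),)
    ).fetchone()
    return row[0] if row else None


composed = "voc\u00ea"     # precomposed e with circumflex (NFC)
decomposed = "voce\u0302"  # e followed by combining circumflex (NFD)

found_composed = lookup(conn, composed)
found_decomposed = lookup(conn, decomposed)
```

Both lookups find the same single row, because the two spellings collapse to one form before they reach SQLite. Existing rows that were stored without normalization would need a one-time pass to rewrite them in the chosen form.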