I have table with words dictionary in my language (latvian). CREATE TABLE words (

Question

Asked: May 14, 2026 (2026-05-14T14:25:58+00:00)

I have a table containing a dictionary of words in my language (Latvian).

CREATE TABLE words (
  value varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

And let’s say it has 3 words inside:
INSERT INTO words (value) VALUES ('tēja');
INSERT INTO words (value) VALUES ('vējš');
INSERT INTO words (value) VALUES ('feja');

I want to find all words that are exactly 4 characters long, where the second character is ‘ē’ and the third character is ‘j’.

It seems to me that the correct query would be:
SELECT * FROM words WHERE value LIKE '_ēj_';
The problem with this query is that it returns not 2 entries (‘tēja’, ‘vējš’) but all three.
As I understand it, this is because MySQL internally converts the strings to some ASCII representation?

Then there is the BINARY modifier for LIKE:
SELECT * FROM words WHERE value LIKE BINARY '_ēj_';
But this does not return the 2 entries (‘tēja’, ‘vējš’) either, only one (‘tēja’). I believe this has something to do with UTF-8 using 2 bytes for non-ASCII characters?

So question:
What MySQL query would return my exact two words (‘tēja’,’vējš’)?

Thank you in advance

1 Answer

Editorial Team · Answer 1 · 2026-05-14T14:25:58+00:00

What MySQL query would return my exact two words (‘tēja’,’vējš’)?

SELECT * FROM words WHERE value LIKE '_ēj_' COLLATE utf8_bin;
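As a rough illustration of why this behaves differently from the LIKE BINARY attempt in the question, the character and byte lengths of the three sample rows can be compared. This is only a sketch against the `words` table from the question, and it assumes the rows are stored as UTF-8 as declared there. With LIKE BINARY the comparison is done on bytes, so each `_` matches a single byte, whereas with COLLATE utf8_bin the comparison stays character-based.

```sql
-- Compare character count and byte count for each stored word.
-- CHAR_LENGTH() counts characters; LENGTH() counts bytes.
SELECT value,
       CHAR_LENGTH(value) AS chars,
       LENGTH(value)      AS bytes
FROM words;
-- 'tēja' : 4 chars, 5 bytes (ē takes 2 bytes in UTF-8)
-- 'vējš' : 4 chars, 6 bytes (ē and š take 2 bytes each)
-- 'feja' : 4 chars, 4 bytes
```

The pattern '_ēj_' is 5 bytes long when treated as a binary string (one byte, two bytes for ē, j, one byte), which is consistent with LIKE BINARY matching ‘tēja’ but not ‘vējš’, whose final š occupies two bytes.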

The utf8_bin collation is not just diacritic-sensitive, but also case-sensitive. If you want to match only the letter with the diacritic and you don’t care about upper/lower case, you would have to find a utf8_..._ci collation that doesn’t treat e and ē as the same letter.

I can’t immediately see one (there are plenty that don’t collate ē at all, which would be okay if you only need case-insensitive matching on the non-diacritical letters). It is interesting that the Latvian collation treats letters with a macron as the same as plain letters, which you don’t want (although it does know that š is different from s).
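One way to hunt for a suitable collation is to list the candidates and probe each one with a direct comparison. The following is a minimal sketch, assuming the `utf8` character set from the question is available on the server; `utf8_latvian_ci` is used only as the example candidate, and the `_utf8` introducers are there so the literals do not depend on the connection character set.

```sql
-- List the collations available for the utf8 character set.
SHOW COLLATION WHERE Charset = 'utf8';

-- Probe a candidate: the result is 1 if the collation treats the two
-- letters as equal and 0 if it distinguishes them.
SELECT _utf8'e' = _utf8'ē' COLLATE utf8_latvian_ci AS e_equals_e_macron,
       _utf8's' = _utf8'š' COLLATE utf8_latvian_ci AS s_equals_s_caron;
```

A collation is a candidate for case-insensitive but diacritic-sensitive matching only if the first kind of comparison returns 0 for every accented letter you care about.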

Anyway, whatever collation you end up with, you will want to put your tables in that collation rather than manually specifying it in a query, so that comparisons can be properly indexed.
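A sketch of what moving the collation into the table definition might look like, assuming utf8_bin is the collation you settle on and keeping the column definition from the question otherwise unchanged:

```sql
-- Change the column's collation so comparisons use it by default.
ALTER TABLE words
  MODIFY value varchar(255)
  CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL;

-- The query then no longer needs an explicit COLLATE clause,
-- because the column's collation applies to the comparison.
SELECT * FROM words WHERE value LIKE '_ēj_';
```

Note that this changes how every comparison and sort on the column behaves, including case sensitivity, so it is worth checking other queries against the table before applying it.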
