When performing a query like:
select count(*) from myTextTable where tsv @@ plainto_tsquery('english', 'TERM');
I’ve noticed that PostgreSQL does not use the GIN index (that I defined on the tsv column) when TERM is 1 or 2 characters long; 3 or more characters work fine.
I understand that indexing 1- or 2-character terms will greatly increase the size of the index, but fast retrieval of texts containing specific 1- or 2-character terms is essential for the application I’m developing.
Is there some full text search configuration parameter to index 1- or 2-character terms?
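For anyone trying to narrow down the same behaviour, here is a minimal diagnostic sketch. It assumes the table and column names from the query above, and 'ab' is only a placeholder for a short term. It checks which plan the planner picks, and whether the term survives the `english` configuration at all (a stop word is dropped from the query before the index is ever consulted).

```sql
-- Show the plan actually chosen for a short term ('ab' is a placeholder).
EXPLAIN ANALYZE
SELECT count(*)
FROM myTextTable
WHERE tsv @@ plainto_tsquery('english', 'ab');

-- See how the parser and dictionaries treat the term.
-- An empty lexemes array means the token was discarded as a stop word.
SELECT alias, token, dictionaries, lexemes
FROM ts_debug('english', 'ab');

-- The query that is actually matched against tsv.
-- An empty tsquery means nothing is left to search for.
SELECT plainto_tsquery('english', 'ab');
```

If the term shows up as a normal lexeme and the plan still shows a sequential scan, the cause is more likely the planner's cost estimate than the text search configuration.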
This issue has been solved now by (a) removing lots of noisy text from the pages (using language detection) and (b) dropping/re-creating the GIN index. My guess is that the noisy text caused an explosion in the number of lexemes and that the index became unusable, or was classified as such by the query planner.
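A rough sketch of how the lexeme count could be inspected and the index rebuilt, assuming the same table and column names as in the question. The index name `mytexttable_tsv_idx` is hypothetical, and `ts_stat` scans the whole table, so it can be slow on large data.

```sql
-- Number of distinct lexemes stored in the tsv column.
SELECT count(*) AS distinct_lexemes
FROM ts_stat('SELECT tsv FROM myTextTable');

-- The most widespread lexemes: ndoc = documents containing the word,
-- nentry = total occurrences. Noise tends to show up as a long tail of rare words.
SELECT word, ndoc, nentry
FROM ts_stat('SELECT tsv FROM myTextTable')
ORDER BY ndoc DESC
LIMIT 20;

-- Drop and re-create the GIN index (index name is hypothetical),
-- then refresh planner statistics.
DROP INDEX IF EXISTS mytexttable_tsv_idx;
CREATE INDEX mytexttable_tsv_idx ON myTextTable USING gin (tsv);
ANALYZE myTextTable;
```

Running the first query before and after cleaning the text gives a direct measure of how much the noise inflated the lexeme set.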