I need to retrieve certain rows from a table depending on certain values in a specific column, named columnX in the example:
select *
from tableName
where columnX similar to ('%A%|%B%|%C%|%1%|%2%|%3%')
So if columnX contains at least one of the values specified (A, B, C, 1, 2, 3), I will keep the row.
I can’t find a better approach than using similar to. The problem is that the query takes too long for a table with more than a million rows.
I’ve tried indexing it:
create index tableName_columnX_idx on tableName (columnX)
where columnX similar to ('%A%|%B%|%C%|%1%|%2%|%3%')
However, if the condition is variable (the values could be other than A, B, C, 1, 2, 3), I would need a different index for each condition.
Is there any better solution for this problem?
EDIT: Thanks everybody for the feedback. It looks like I ended up at this point because of a design mistake (a topic I’ve posted in a separate question).
I agree with Quassnoi, a GIN index is fastest and simplest – unless write performance or disk space are issues, because it occupies a lot of space and adds cost for INSERT, UPDATE and DELETE.
My additional answer is triggered by your statement:
"I can’t find a better approach than using similar to."
If that is what you found, then your search isn’t over, yet.
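One alternative worth comparing is a plain regular expression match. The following is a minimal sketch, reusing tableName and columnX from the question and assuming the values to look for are single characters, as in the example. The multi-character values in the comment are hypothetical placeholders.

```sql
-- Same filter as the SIMILAR TO query in the question, written as a
-- POSIX regular expression match: true if columnX contains at least
-- one of the listed characters anywhere in the string.
SELECT *
FROM   tableName
WHERE  columnX ~ '[ABC123]';

-- For multi-character values, alternation works the same way
-- ('foo' and 'bar' are hypothetical placeholders):
--   WHERE columnX ~ 'foo|bar'

-- Compare plan and timing against the original form:
EXPLAIN ANALYZE
SELECT *
FROM   tableName
WHERE  columnX SIMILAR TO '%A%|%B%|%C%|%1%|%2%|%3%';
```

Unlike SIMILAR TO, the ~ operator does not have to match the whole string, so the leading and trailing % wildcards are not needed. Without a suitable index both forms still scan the whole table, so any gain from this change alone is likely to be modest.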
SIMILAR TO is a complete waste of time. Literally. Postgres only includes it to comply with the (weird) SQL standard. Inspect the output of EXPLAIN ANALYZE for your query and you will find that SIMILAR TO has been replaced by a regular expression.
Internally every SIMILAR TO expression is rewritten to a regular expression. Consequently, for each and every SIMILAR TO expression there is at least one regular expression match that is a bit faster. Let EXPLAIN ANALYZE translate it for you, if you are not sure. You won’t find this in the manual, PostgreSQL does not promise to do it this way, but I have yet to see an exception.
Further reading: