I’m trying to speed this query up as much as possible. It’s not terribly

Question

0

Asked: May 25, 20262026-05-25T03:12:22+00:00 2026-05-25T03:12:22+00:00

I’m trying to speed this query up as much as possible. It’s not terribly

0

I’m trying to speed this query up as much as possible. It’s not terribly slow, but I need it be as fast as can be.

SELECT name
FROM (
  SELECT cities.name || ', ' || regions.name || ', ' || countries.code AS name
  FROM cities
  INNER JOIN regions ON regions.id = cities.region_id
  INNER JOIN countries ON countries.id = regions.country_id
) AS t1
GROUP BY name
HAVING LOWER(name) ILIKE 'asheville%'
ORDER BY name ASC
LIMIT 10;

These indexes exist:

UNIQUE INDEX index_cities_on_name_and_region_id ON cities USING btree (name, region_id)
UNIQUE INDEX index_countries_on_code ON countries USING btree (code)
UNIQUE INDEX index_countries_on_name ON countries USING btree (name)
UNIQUE INDEX index_regions_on_code_and_country_id ON regions USING btree (code, country_id)

cities table contains 248016 records.
countries table contains 252 records.
regions table contains 4005 records.

Here’s the explain output of the query: http://explain.depesz.com/s/fWe

Any help would be greatly appreciated. Basically I’m just looking for suggestions or maybe pointing out something I may have missed.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-25T03:12:23+00:00

In your subquery, you should return both the name you are already returning, and cities.name as cname. Then, you should do your ilike on cname instead of on name. The issue is that right now there is no way that PostgreSQL could really be expected to infer that as 'ashville%' doesn’t have any commas in it, it can just look at the city name in the subquery, so it is really having to (and is, based on your explain) iterate and build every single possible string in order to do that final filter. If you return cities.name back to the upper query it will dramatically improve the performance, as right now it seriously can’t use any of those indexes you have.

Really, you should go all the way here, and just remove the string concatenation inside of the query and return what you really intended: select cities.name as city, regions.name as region, countries.code as country, and modify sort to order by t1.city, t1.region, t1.country.

Additionally, are you really asking for cities that have 'ashville%', or is this just an indirect way of looking for cities that are 'ashville', but you have to deal with the comma delineation internally? Then, outside, use lower(t1.city) = 'ashville' (note the =: lower(x) ilike 'lower' is pointlessly slow).

Also, you need to fix these indexes: what you really want is create index whatever on cities((lower(name))), as that’s what you are actually searching by, not name: there is no way that you can ever get these indexes to be used if you are searching for something that is unrelated to what you have in the index.

(You might look at the order by name later and be concerned that it won’t be accelerated anymore, but that’s ok: the goal here is to rapidly filter from the tons of possible locations down to the tiny set of ones that you are going to operate on; what’s left over can be rapidly sorted in memory, as you are probably dealing with 10-20 results.)

Due to this, as regions.id and countries.id are probably primary keys, the other indexes can be deleted if they are only there for this query.

Finally, flatten the query to a single level, remove the group by, and replace it with a distinct. The issue is that we want to make certain that we don’t force PostgreSQL to generate the full set before attempting the filter: we want to make certain it has enough knowledge of the goal to be able to use the city index to rapidly scan directly to the cities that can match, and then get around to filling in the region and country information.

(PostgreSQL is generally very very good at doing this, even through a sub-query, but as we have a group by clause through a having, I can see situations where it will no longer be able to infer.)

(edit) Actually, wait: you have a unique index on cities (name, region_id), so you don’t even need the distinct… all it is doing is making the query pointlessly more complex. I just went ahead and removed it from the query: the result will be the same, as you cannot possibly end up with a result where you have the same city in the same region/country being returned twice.

select
    cities.name as city,
    regions.name as region,
    countries.code as country
from cities
join regions on
    regions.id = cities.region_id
join countries on
    countries.id = regions.country_id
where
    lower(cities.name) = 'asheville'
order by
    cities.name,
    regions.name,
    countries.code
limit 10;

create index "cities(lower(name))" on cities ((lower(name)));

(edit) If, by the way, you actually are meaning to do a prefix match, then you will want to change the = 'asheville' back to a like 'ashevill%' (note the like: no i), and change the index to be specified as follows:

create index "cities(lower(name))" on cities ((lower(name)) text_pattern_ops);

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I’m trying to speed this query up as much as possible. It’s not terribly

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply