I have a query that performs several joins:
SELECT feedback.note AS notes,
       to_char(feedback.data_compilazione, 'DD/MM/YYYY') AS date,
       CAST(feedback.punteggio AS INT) AS score,
       upper(substring(cliente.nome from '^.') || '.' ||
             substring(cliente.cognome from '^.') || '.') AS customer,
       testo.testo AS nation
FROM feedback
JOIN prenotazione ON prenotazione.id = feedback.id_prenotazione
JOIN cliente ON cliente.id = prenotazione.id_cliente
JOIN struttura ON struttura.id = prenotazione.id_struttura
JOIN lingua ON cliente.id_lingua = lingua.id
JOIN nazione ON cliente.codice_nazione = nazione.codice_iso
JOIN testo ON testo.nome_tabella = 'nazione'
          AND testo.id_record = nazione.id
          AND testo.id_lingua = lingua.id
          AND testo.id_tipo_testo = 1
WHERE struttura.id = 43
  AND lingua.sigla = E'en'
  AND feedback.punteggio >= 3
  AND feedback.note <> ''
ORDER BY feedback.data_compilazione DESC
LIMIT 5
My problem is that I haven’t created any explicit indexes on my tables.
As a result, this query takes a very long time to execute.
AFAIK PostgreSQL creates an “implicit” index every time I declare a primary key, so I don’t have to add those explicitly.
This is the EXPLAIN output of the query:
"Limit (cost=212.72..212.73 rows=1 width=208)"
" -> Sort (cost=212.72..212.73 rows=1 width=208)"
" Sort Key: feedback.data_compilazione"
" -> Nested Loop (cost=1.11..212.71 rows=1 width=208)"
" -> Nested Loop (cost=1.11..206.86 rows=1 width=212)"
" Join Filter: (("outer".codice_nazione)::text = ("inner".codice_iso)::text)"
" -> Nested Loop (cost=1.11..201.63 rows=1 width=223)"
" -> Nested Loop (cost=1.11..195.60 rows=1 width=187)"
" Join Filter: ("outer".id = "inner".id_cliente)"
" -> Nested Loop (cost=1.11..45.18 rows=1 width=183)"
" Join Filter: ("outer".id_lingua = "inner".id_lingua)"
" -> Index Scan using testo_pkey on testo (cost=0.00..6.27 rows=1 width=40)"
" Index Cond: (((nome_tabella)::text = 'nazione'::text) AND (id_tipo_testo = 1))"
" -> Hash Join (cost=1.11..38.86 rows=4 width=155)"
" Hash Cond: ("outer".id_lingua = "inner".id)"
" -> Seq Scan on cliente (cost=0.00..33.47 rows=847 width=151)"
" -> Hash (cost=1.11..1.11 rows=1 width=4)"
" -> Seq Scan on lingua (cost=0.00..1.11 rows=1 width=4)"
" Filter: ((sigla)::text = 'en'::text)"
" -> Seq Scan on prenotazione (cost=0.00..150.05 rows=30 width=12)"
" Filter: (43 = id_struttura)"
" -> Index Scan using feedback_id_prenotazione_key on feedback (cost=0.00..6.01 rows=1 width=44)"
" Index Cond: ("outer".id = feedback.id_prenotazione)"
" Filter: ((punteggio >= 3::double precision) AND (note <> ''::text))"
" -> Index Scan using nazione_pkey on nazione (cost=0.00..5.21 rows=1 width=11)"
" Index Cond: ("outer".id_record = nazione.id)"
" -> Index Scan using struttura_pkey on struttura (cost=0.00..5.82 rows=1 width=4)"
" Index Cond: (id = 43)"
So I stopped to think about indexing in my DB.
What is the best practice for creating indexes?
Is the solution to create an index on every joined field?
And in my DB, what would you suggest indexing?
I’ve made a few attempts (essentially indexing every field that is joined) and the query now runs faster, but still not fast (about six seconds to retrieve zero rows).
I suppose that my solution isn’t the best.
Can anybody point me in the right direction?
EDIT
If I add just one index (on prenotazione.id_cliente), everything runs in about 1.5 seconds. So why does adding indexes on all the FOREIGN keys make my query run slower?
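For reference, the single index mentioned in this edit could be created roughly as follows. This is only a sketch: the index name is made up for illustration, and the small join at the end is just a stand-in for the full query, used to compare real timings before and after adding the index.

```sql
-- Hypothetical index name; the indexed column is the one named in the edit.
CREATE INDEX prenotazione_id_cliente_idx ON prenotazione (id_cliente);

-- Refresh planner statistics for the table after changing its indexes.
ANALYZE prenotazione;

-- EXPLAIN ANALYZE executes the statement and reports actual row counts and
-- timings next to the planner's estimates.
EXPLAIN ANALYZE
SELECT prenotazione.id
FROM prenotazione
JOIN cliente ON cliente.id = prenotazione.id_cliente
WHERE prenotazione.id_struttura = 43;
```

Running the same EXPLAIN ANALYZE with and without each additional index should show which ones the planner actually uses.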
As a first rule of thumb, without even looking at the query plan, I would certainly put indexes on foreign keys that have high selectivity (more than 100 distinct values).
Some databases create them without asking; Postgres doesn’t.
We are talking about indexes on FOREIGN keys, not PRIMARY ones (these are obviously always provided).
For instance, prenotazione.id_struttura and feedback.id_prenotazione are likely candidates.
On the other hand, a weekday column with 7 distinct values would not benefit from an index, unless there are thousands of Mondays and very few Sundays, in which case the index is selective for some values.
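As a rough sketch of this rule of thumb, one could first count the distinct values in a foreign-key column and then index the candidates named above. The index names here are hypothetical. Note also that the plan in the question already shows an index called feedback_id_prenotazione_key, so the second CREATE INDEX may well be redundant.

```sql
-- Count distinct values in a foreign-key column before deciding to index it.
SELECT count(DISTINCT id_struttura) AS distinct_values,
       count(*) AS total_rows
FROM prenotazione;

-- Hypothetical index name; pick whatever naming convention you use.
CREATE INDEX prenotazione_id_struttura_idx ON prenotazione (id_struttura);

-- Possibly redundant: the EXPLAIN output in the question already uses an
-- index named feedback_id_prenotazione_key on this column.
CREATE INDEX feedback_id_prenotazione_idx ON feedback (id_prenotazione);

-- Refresh planner statistics so the new indexes are costed with current data.
ANALYZE prenotazione;
ANALYZE feedback;
```

If the distinct count turns out to be very low compared with the total row count, the index is unlikely to help for most values.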