I have a table containing the runtimes for generators on different sites, and I want to select the most recent entry for each site. Each generator is run once or twice a week.
I have a query that will do this, but I wonder if it’s the best option. I can’t help thinking that using WHERE x IN (SELECT …) is lazy and not the best way to formulate the query – any query.
The table is as follows:
CREATE TABLE generator_logs (
    id       integer                     NOT NULL,
    site_id  character varying(4)        NOT NULL,
    start    timestamp without time zone NOT NULL,
    "end"    timestamp without time zone NOT NULL,  -- reserved word, so it needs double quotes
    duration integer                     NOT NULL
);
And the query:
SELECT id, site_id, start, "end", duration
FROM generator_logs
WHERE start IN (SELECT MAX(start) AS start
                FROM generator_logs
                GROUP BY site_id)
ORDER BY start DESC;
There isn’t a huge amount of data here, so I’m not worried about optimizing this query. However, I do have to do similar things on tables with tens of millions of rows (big tables, as far as I’m concerned!), and there optimisation matters more.
So is there a better query for this, and are inline queries generally a bad idea?
I would use a join rather than an IN clause here, as joins often perform better than IN with a subquery.
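A minimal sketch of the join form, assuming PostgreSQL (suggested by the column types in the question) and using the hypothetical aliases g and latest. The derived table finds the latest start per site, and the join matches on both site_id and start:

```sql
-- Latest run per site: join each row to its own site's MAX(start).
SELECT g.id, g.site_id, g.start, g."end", g.duration
FROM generator_logs AS g
JOIN (
    -- One row per site with that site's most recent start time.
    SELECT site_id, MAX(start) AS start
    FROM generator_logs
    GROUP BY site_id
) AS latest
  ON latest.site_id = g.site_id
 AND latest.start   = g.start
ORDER BY g.start DESC;
```

If two rows for the same site share the same latest start, both are returned. Whether this actually beats the IN form depends on the planner and the available indexes, so it is worth comparing the plans with EXPLAIN on the large tables.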
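Also, as Tony pointed out, your original query is missing a correlation: the subquery returns the latest start for every site, but the outer query does not tie each value back to its own site_id, so a row can match simply because its start equals another site's latest start.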
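To illustrate what the correlation looks like, here is a sketch that keeps the subquery but ties it to the outer row's site. The aliases g and s are hypothetical names chosen for this example:

```sql
-- Correlated subquery: MAX(start) is computed for the outer row's own site only.
SELECT g.id, g.site_id, g.start, g."end", g.duration
FROM generator_logs AS g
WHERE g.start = (
    SELECT MAX(s.start)
    FROM generator_logs AS s
    WHERE s.site_id = g.site_id   -- the correlation missing from the original
)
ORDER BY g.start DESC;
```

This should return the same rows as the join form; which of the two runs faster is something to check against the real data rather than assume.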