I have a table that stores car prices with a structure like:
+-----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| model | varchar(255) | YES | | NULL | |
| make | varchar(255) | YES | | NULL | |
| year | varchar(255) | YES | | NULL | |
| avg_price | decimal(8,2) | YES | | NULL | |
| median_price | decimal(8,2) | YES | | NULL | |
| created_at | datetime | YES | | NULL | |
| updated_at | datetime | YES | | NULL | |
+-----------------+--------------+------+-----+---------+----------------+
The data for a given year might be inserted at different times.
For example, make 'Honda', model 'Accord' might have results like:
+------+-----------+--------------+--------+---------------------+
| year | avg_price | median_price | model | created_at |
+------+-----------+--------------+--------+---------------------+
| 1992 | 2431.29 | 2000.00 | accord | 2012-02-23 17:31:41 |
| 1993 | 2609.13 | 2195.00 | accord | 2012-02-23 17:31:44 |
| 1994 | 2858.81 | 2400.00 | accord | 2012-02-23 17:31:44 |
| 2000 | 4771.99 | 4450.00 | accord | 2012-02-23 17:31:46 |
| 2001 | 5260.16 | 5000.00 | accord | 2012-02-23 17:31:46 |
| 2000 | 4860.19 | 4795.00 | accord | 2012-08-15 06:09:52 |
| 2001 | 5071.49 | 4990.00 | accord | 2012-08-15 06:09:52 |
| 2002 | 5872.80 | 5795.00 | accord | 2012-08-15 06:09:52 |
| 2003 | 7521.44 | 7950.00 | accord | 2012-08-15 06:09:52 |
| 2004 | 8348.19 | 8495.00 | accord | 2012-08-15 06:09:52 |
+------+-----------+--------------+--------+---------------------+
I would like to retrieve, for each model year, the most recently inserted Honda Accord row.
So in the above example, I would like to retrieve the data from 2012-08-15 06:09:52 for years 2000, 2001, 2002, 2003 and 2004,
but for the older years (1992, 1993, 1994) the data would come from the rows inserted on 2012-02-23.
select year,avg_price,median_price,model,created_at
from car_prices
where make='honda' and model= 'accord' group by year asc
The above query gets distinct data for each year but not the last record inserted for each year.
Any ideas how to get distinct data for each year as well as the latest?
For both performance and guaranteed behaviour, you should create a lookup and join on that.
In your case you want to find the most recent created_at value for any given make, model, year group. A sub-query that groups on those columns and takes MAX(created_at) does that.
Then you join that back on your original data again, finding only the records that have those make, model, year, created_at values.
This does mean that if you have more than one record with the same make, model, year, created_at values, you will get multiple results for that make, model, year.
Ensure that you have an index covering (make, model, year, created_at) to make the search for the most recent created_at quick, as well as the join.
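A minimal sketch of the lookup-and-join described above, assuming the table is named car_prices as in the question's query. The aliases cp and latest and the index name idx_car_prices_latest are illustrative, not taken from the original.

```sql
-- Lookup: the most recent created_at for each (make, model, year) group.
-- Join:   keep only the rows in car_prices that match that timestamp.
SELECT
    cp.year,
    cp.avg_price,
    cp.median_price,
    cp.model,
    cp.created_at
FROM car_prices AS cp
INNER JOIN (
    SELECT make, model, year, MAX(created_at) AS created_at
    FROM car_prices
    WHERE make = 'honda' AND model = 'accord'
    GROUP BY make, model, year
) AS latest
    ON  latest.make       = cp.make
    AND latest.model      = cp.model
    AND latest.year       = cp.year
    AND latest.created_at = cp.created_at
ORDER BY cp.year;

-- Supporting index (name is hypothetical). With three varchar(255) columns,
-- the character set and storage engine may force prefix lengths here.
CREATE INDEX idx_car_prices_latest
    ON car_prices (make, model, year, created_at);
```

Against the sample rows in the question, this should return one row per year: 1992 to 1994 from the 2012-02-23 inserts, and 2000 to 2004 from the 2012-08-15 insert. Filtering inside the sub-query keeps the lookup small; the join then picks up the remaining columns for exactly those rows.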