I have a table named Info of this schema:
int objectId; int time; int x, y;
There is a lot of redundant data in the system – that is, objectId is not UNIQUE. For each objectId there can be multiple entries of time, x, y.
I want to retrieve a list of the latest position of each object. I started out with this query:
SELECT * FROM Info GROUP BY objectId
That got me just the kind of list I was looking for. However, I also want only the latest time for each object, so I tried:
SELECT * FROM Info GROUP BY objectId ORDER BY time DESC
This gave me a list of Infos sorted by time in descending order. However, I don't think it does what I want, which is to return the latest time, x, y for each object.
Can anyone suggest a query that does what I want?
Update: I have tried the top three solutions to see how they perform against each other on a dataset of about 50,000 Infos. Here are the results:

-- NO INDEX: forever
-- INDEX: 7.67 s
SELECT a.*
FROM Info AS a
LEFT OUTER JOIN Info AS b
  ON (a.objectId = b.objectId AND a.time < b.time)
WHERE b.objectId IS NULL;

-- NO INDEX: 8.05 s
-- INDEX: 0.17 s
SELECT a.objectId, a.time, a.x, a.y
FROM Info a,
     (SELECT objectId, MAX(time) time FROM Info GROUP BY objectId) b
WHERE a.objectId = b.objectId AND a.time = b.time;

-- NO INDEX: 8.30 s
-- INDEX: 0.18 s
SELECT A.time, A.objectId, B.x, B.y
FROM (
  SELECT MAX(time) AS time, objectId
  FROM Info
  GROUP BY objectId
) AS A
INNER JOIN Info B
  ON A.objectId = B.objectId AND A.time = B.time;

By a small margin, the query that joins in the WHERE clause appears to outperform the explicit INNER JOIN. Both are far faster than the LEFT OUTER JOIN, with or without an index.
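For anyone who wants to try the indexed case, here is a rough sketch. The benchmark does not say which columns were indexed, so the composite index on (objectId, time) and its name idx_info_object_time are assumptions, as are the sample rows. It uses an in-memory SQLite database through Python's sqlite3 only to make the example self-contained; timings and query plans on another database engine will differ.

```python
import sqlite3

# In-memory database with the Info schema from the question; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Info (objectId INT, time INT, x INT, y INT)")
conn.executemany(
    "INSERT INTO Info VALUES (?, ?, ?, ?)",
    [(1, 10, 0, 0), (1, 20, 5, 6), (2, 5, 1, 1), (2, 15, 7, 8)],
)

# Assumed index: the benchmark does not state which columns were indexed.
conn.execute("CREATE INDEX idx_info_object_time ON Info (objectId, time)")

# Second benchmarked query: match each row against the per-object maximum time.
# ORDER BY is added here only to make the output order deterministic.
query = """
    SELECT a.objectId, a.time, a.x, a.y
    FROM Info AS a,
         (SELECT objectId, MAX(time) AS time FROM Info GROUP BY objectId) AS b
    WHERE a.objectId = b.objectId AND a.time = b.time
    ORDER BY a.objectId
"""
indexed_rows = conn.execute(query).fetchall()
print(indexed_rows)

# Show how SQLite plans the query; the exact text varies between SQLite versions.
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step)
```

An index that starts with objectId and includes time lets the engine find each object's maximum time without scanning the whole table, which is one plausible reason the indexed runs are so much faster.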
One way is using a subquery.
EDIT: Added DISTINCT to prevent duplicate rows if one objectId has multiple records with the same time. Whether this is necessary depends on your data; the question author mentioned there were many duplicate rows. (added by Tomalak)
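A minimal sketch of how the subquery approach with DISTINCT could look. It is not necessarily the exact query this answer used. The aliases i and j and the sample rows are made up for illustration, and the query runs against an in-memory SQLite database through Python's sqlite3 so it can be tried as-is.

```python
import sqlite3

# In-memory database with the Info schema from the question; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Info (objectId INT, time INT, x INT, y INT)")
conn.executemany(
    "INSERT INTO Info VALUES (?, ?, ?, ?)",
    [
        (1, 10, 0, 0),
        (1, 20, 5, 6),
        (1, 20, 5, 6),  # exact duplicate of the latest row for object 1
        (2, 15, 7, 8),
        (2, 5, 1, 1),
    ],
)

# Correlated subquery: keep only rows whose time equals the maximum time
# recorded for the same objectId. DISTINCT collapses exact duplicate rows.
latest = conn.execute(
    """
    SELECT DISTINCT i.objectId, i.time, i.x, i.y
    FROM Info AS i
    WHERE i.time = (SELECT MAX(j.time) FROM Info AS j WHERE j.objectId = i.objectId)
    ORDER BY i.objectId
    """
).fetchall()

print(latest)
```

Note that DISTINCT only removes rows that are identical in every selected column. If an object has two rows with the same latest time but different x or y, both are still returned.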