As “everyone knows”, you can’t return non-grouped non-aggregated columns in a GROUP BY, in other words, “give me the ID, name and address of the employee with the highest salary in each department.” Of course this isn’t quite true: http://dev.mysql.com/doc/refman/5.1/en/group-by-hidden-columns.html But this contains a rather ominous warning:
The server is free to choose any value from each group, so unless they
are the same, the values chosen are indeterminate.
MySQL has another article on this problem: http://dev.mysql.com/doc/refman/5.0/en/example-maximum-column-group-row.html But the technique recommended there doesn’t actually take advantage at all of hidden columns. There’s a comment in that article from Kasey Speakman, who recommends using an ordered subquery, like so:
select deptno, emp_id, address, name from
(select * from emp order by salary desc)
group by deptno
My questions are: a) Can I safely rely on MySQL to pick the “first” row from each group, since the subquery is ordered, and b) in general, and assuming appropriate indexes, is this likely to perform better than, say, the LEFT JOIN technique mentioned in the same article?
There was recently a discussion on a similar question here: SQL: What is the default Order By of queries?
But, nevertheless, I think ranking queries are the example of the queries in MySQL where it is quite useful to rely on the predicted order (predicted by using specific indexes).
Look at my answer to the following question: Retrieving the last record in each group
That is the answers to your questions:
yes, sometimes you can rely on the order when you know the engine and the indexes used, though it is not usually friendly accepted
when there are many items within each group the LEFT JOIN solution might take too long to execute, so that relying on the bare indexes might become almost the only solution. But the solution should not generate huge intermediate temporary tables.
But your query:
is the worst possible idea, since it generates an unindexed copy of your table and operates on it making no use of any optimizations.