Say I have a table “transactions” that has columns “acct_id” “trans_date” and “trans_type” and I want to filter this table so that I have just the last transaction for each account. Clearly I could do something like
SELECT acct_id, max(trans_date) as trans_date
FROM transactions GROUP BY acct_id;
but then I lose my trans_type. I could then do a second SQL call with my list of dates and account id’s and get my trans_type back but that feels very cludgy since it means either sending data back and forth to the sql server or it means creating a temporary table.
Is there a way to do this with a single query, hopefully a generic method that would work with mysql, postgres, sql-server, and oracle.
This is an example of a greatest-n-per-group query. This question comes up several times per week on StackOverflow. In addition to the subquery solutions given by other folks, here’s my preferred solution, which uses no subquery,
GROUP BY, or CTE:In other words, return a row such that no other row exists with the same
acct_idand a greatertrans_date.This solution assumes that
trans_dateis unique for a given account, otherwise ties may occur and the query will return all tied rows. But this is true for all the solutions given by other folks too.I prefer this solution because I most often work on MySQL, which doesn’t optimize
GROUP BYvery well. So this outer join solution usually proves to be better for performance.