I’m trying to group by multiple columns here – one on each table.
It’s a scenario where I want to find the top portfolio value for each client by adding their current portfolio and cash together but a client may have more than one portfolio, so I need the top portfolio for each client.
At the moment, with the code below I’m getting the same clients multiple times for each of their top portfolios (it’s not grouping by client id).
SELECT clients.id, clients.name, portfolios.id, SUM ( portfolios.portfolio + portfolios.cash ) AS total
FROM clients, portfolios
WHERE clients.id = portfolios.client_id
GROUP BY portfolios.id, clients.id
ORDER BY total DESC
LIMIT 30
First, let’s make some test data:
If you didn’t need the portfolio ID, it would be easy:
Since you need the portfolio ID, things get more complicated. Let’s do it in steps. First, we’ll write a subquery that returns the maximal portfolio value for each client:
Then we’ll query the portfolio table, but use a join to the previous subquery in order to keep only those portfolios the total value of which is the maximal for the client:
Finally, we can join to the client table (as you did) in order to include the name of each client:
Note that this returns two rows for John Doe because he has two portfolios with the exact same total value. To avoid this and pick an arbitrary top portfolio, tag on a GROUP BY clause: