The following query gets the info that I need. However, I noticed that as the tables grow, my code gets slower and slower. I’m guessing it is this query. Can this be written a different way to make it more efficient? I’ve heard a lot about using joins instead of subqueries, but I don’t “get” how to do it.
SELECT * FROM
    (SELECT MAX(T.id) AS MAXid
     FROM transactions AS T
     GROUP BY T.position
     ORDER BY T.position) AS result1,
    (SELECT T.id AS id, T.symbol, T.t_type, T.degree, T.position, T.shares, T.price, T.completed, T.t_date,
            DATEDIFF(CURRENT_DATE, T.t_date) AS days_past,
            IFNULL(SUM(S.shares), 0) AS subtrans_shares,
            T.shares - IFNULL(SUM(S.shares), 0) AS due_shares,
            (SELECT IFNULL(SUM(IF(SO.t_type = 'sell', -SO.shares, SO.shares)), 0)
             FROM subtransactions AS SO WHERE SO.symbol = T.symbol) AS owned_shares
     FROM transactions AS T
     LEFT OUTER JOIN subtransactions AS S
     ON T.id = S.transid
     GROUP BY T.id
     ORDER BY T.position) AS result2
WHERE MAXid = id
Your code:
Notice the [<---- here] marks I added to your code.
The first T is not in any way related to the second T. They have the same correlation alias, they refer to the same table, but they’re entirely independent selects and results.
So what you’re doing in the first, uncorrelated, subquery is getting the max id for all positions in transactions.
And then you’re joining all transaction.position.max(id)s to result2 (which happens to be a join of all transaction.positions to subtransactions). (And the internal ORDER BY is pointless and costly, too, but that’s not the main problem.)
You’re joining every transaction.position.max(id) to every (whatever result2 selects).
On edit, after getting home: OK, you’re not Cartesianing; the “where MAXid = id” does join result1 to result2. But you’re still rolling up all rows of transactions in both queries.
What I originally wrote here, which the edit above corrects: you’re getting a Cartesian join, every result1 joined to every result2, unconditionally (nothing tells the database, for example, that they ought to be joined by (max) id or by position). So if you have ten unique position.max(id)s in transactions, you’re getting 100 rows. 1000 unique positions, a million rows. Etc.
When you want to write a complicated query like this, it’s a lot easier if you compose it out of simpler views. In particular, you can test each view on its own, to make sure you’re getting reasonable results, and then just join the views.
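As a rough sketch of composing it out of views, here is one way your query could be split up. This assumes MySQL, and assumes id is the primary key of transactions. The view names latest_transaction, subtrans_totals and owned_by_symbol are made up for illustration, and none of this has been run against your schema, so treat it as a starting point rather than a drop-in replacement.

```sql
-- Hypothetical view: the newest transaction id for each position.
CREATE VIEW latest_transaction AS
SELECT T.position, MAX(T.id) AS max_id
FROM transactions AS T
GROUP BY T.position;

-- Hypothetical view: total subtransaction shares per parent transaction.
CREATE VIEW subtrans_totals AS
SELECT S.transid, SUM(S.shares) AS subtrans_shares
FROM subtransactions AS S
GROUP BY S.transid;

-- Hypothetical view: net owned shares per symbol (sells count as negative).
CREATE VIEW owned_by_symbol AS
SELECT SO.symbol,
       SUM(IF(SO.t_type = 'sell', -SO.shares, SO.shares)) AS owned_shares
FROM subtransactions AS SO
GROUP BY SO.symbol;

-- Final query: start from the latest row per position, then attach the totals.
-- The LEFT JOINs keep transactions that have no subtransactions at all;
-- IFNULL turns their missing totals into 0, as in the original query.
SELECT T.id, T.symbol, T.t_type, T.degree, T.position, T.shares, T.price,
       T.completed, T.t_date,
       DATEDIFF(CURRENT_DATE, T.t_date) AS days_past,
       IFNULL(ST.subtrans_shares, 0) AS subtrans_shares,
       T.shares - IFNULL(ST.subtrans_shares, 0) AS due_shares,
       IFNULL(OS.owned_shares, 0) AS owned_shares
FROM latest_transaction AS L
JOIN transactions AS T ON T.id = L.max_id
LEFT JOIN subtrans_totals AS ST ON ST.transid = T.id
LEFT JOIN owned_by_symbol AS OS ON OS.symbol = T.symbol
ORDER BY T.position;
```

You can run SELECT * FROM latest_transaction (and likewise the other two) on its own to check that each piece returns what you expect before joining them. Whether the joined version is actually faster depends on your indexes and MySQL version, so compare the two with EXPLAIN rather than taking it on faith.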