I have a stored procedure that runs to update gaming points for user balances. It’s an insert with 5 subqueries. I have isolated one of the subqueries as the query that slows the entire batch down. Without it, the stored procedure will run in under 2 seconds. With it, it will take as much as 8 seconds. 8 Seconds isn’t the end of the world, but for the sake of scalability, I will need to have it complete faster. Here is the isolated subquery:
(SELECT IsNull(Sum(A.TransAmount) + Sum(Case When A.BetResult = 1 Then (A.BetWinAmount + (A.TransAmount * -1)) End), 0)
FROM User_T A
LEFT OUTER JOIN User_TD B on A.TID = B.TID
LEFT OUTER JOIN Lines_BL C ON B.LID = C.LID
LEFT OUTER JOIN Lines_BM D ON C.BMID = D.BMID
LEFT OUTER JOIN Event_M E ON D.EID = E.EID
LEFT OUTER JOIN Event_KB F ON A.TransReason = F.BID
LEFT OUTER JOIN Event_M G ON F.BID = G.EID
where A.UserID = U.UserID AND (A.IsSettled = 1)
AND
(
(A.TransReason = 1 AND (datediff(dd, Convert(datetime, E.EDate, 101), Convert(datetime, @EndDate, 101)) = @DaysAgo)) OR
(A.TransReason >= 3000 AND (datediff(dd, Convert(datetime, G.EDate, 101), Convert(datetime, @EndDate, 101)) = @DaysAgo)
AND [dbo].[Event_CEAFKBID](A.TransReason) = 1) OR
(A.TransReason BETWEEN 3 and 150 AND (datediff(dd, Convert(datetime, A.TransDT, 101), Convert(datetime, @EndDate, 101)) = @DaysAgo))
)
What I have done to further isolate: When I run a Select * on just the joins (without the where clauses), the performance in very good – > 100000 rows in under a second. As I add in the where clauses, I believe the great slow down is from the ‘or’ clause and/or the function that needs to be evaluated.
As I understand it, a function inside the where clause evaluates each row – as opposed to somehow caching the definition of the function and evaluating that way. I do have indexes on the tables, but I am wondering if some of them are not correct.
Without you knowing the full database structure, I am sure it’s very difficult to pin down where the problem is, but I would like to get pointed in a direction to begin to further isolate.
I suspect your biggest performance hits are from the correlated subquery (whatever table is behind U.UserId) and from the embedded function call dbo.Event_CEAFKBID. Much of course depends upon how big the tables are (how many rows are being read). All those datetime conversions won’t help and generate a very strong “bad design” smell, but I don’t think they’d impact performance too much.
Those left outer joins are ugly, as the optimizer has to check them all for row – so if “A” is big, all the joins on all the rows have to be performed, even if there’s no data there. If they can be replaced with inner joins, do so, but I’m guessing not because of that “table E or table G” logic. Lesses, it sure looks like what you’ve got is three separate queries moshed into one; if you broke it out into three, unioned together, it’d look something like the Frankenstein query below. I’ve no idea if this would run faster or not (heck, I can’t even debug the query and make sure the panetheses balance), but if you’ve got sparse data relative to your logic this should run pretty fast. (I took out the date conversions to make the code more legible, you’d have to plug them back in.)