Yesterday I asked a question about CTEs and running total calculations: "Calculating information by using values from previous line".
I came up with a solution. However, when I applied it to my actual database (over 4.5 million records), it seemed to take forever: it ran for over 3 hours before I stopped it. I then tried to run it on a subset (CTEtest as (select top 100)), and it's been going for an hour and a half. Is this because it still needs to run through the whole thing before selecting the top 100? Or should I assume that if this query takes 2 hours for 100 records, it will take days for 4.5 million? How can I optimize this?
Is there any way to see how much time is remaining on the query?
I think you are better off doing the running sum as a correlated subquery. This will allow you to better manage indexes for performance:
With this structure, an index on txn_by_month(memberid, accountid, balance, netamt) should be able to satisfy this part of the query, without going back to the original data.
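As a rough illustration of the general shape, here is a minimal sketch of a running sum written as a correlated subquery. It is wrapped in Python with an in-memory SQLite database only so that it runs standalone; the SELECT itself should carry over to SQL Server largely unchanged. The ordering column txn_month, the index name ix_txn_running, and the sample rows are assumptions made for the example, not taken from the real table, so substitute whatever column defines row order in your data and put it in the index after the equality columns.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE txn_by_month (
        memberid  INTEGER,
        accountid INTEGER,
        txn_month INTEGER,  -- hypothetical ordering column (assumption)
        netamt    REAL
    );
    -- Covering index: equality columns first, then the ordering column,
    -- then the value being summed, so the subquery can be answered
    -- from the index alone.
    CREATE INDEX ix_txn_running
        ON txn_by_month (memberid, accountid, txn_month, netamt);
    -- Made-up sample rows.
    INSERT INTO txn_by_month VALUES
        (1, 10, 1, 100.0),
        (1, 10, 2, -30.0),
        (1, 10, 3,  50.0),
        (2, 20, 1,   5.0),
        (2, 20, 2,   7.0);
""")

# For each row, sum netamt over all rows of the same member/account
# up to and including the current row's position in the ordering.
RUNNING_SUM_SQL = """
    SELECT t.memberid, t.accountid, t.txn_month, t.netamt,
           (SELECT SUM(t2.netamt)
              FROM txn_by_month AS t2
             WHERE t2.memberid  = t.memberid
               AND t2.accountid = t.accountid
               AND t2.txn_month <= t.txn_month) AS running_total
      FROM txn_by_month AS t
     ORDER BY t.memberid, t.accountid, t.txn_month
"""

rows = conn.execute(RUNNING_SUM_SQL).fetchall()
for row in rows:
    print(row)
```

Each outer row triggers one lookup restricted to a single (memberid, accountid) group, which is why the leading index columns matter: without them, every lookup would have to scan the whole table.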