I have a database with a table that stores changes in account balance across a couple of accounts. It has three columns:
float balance, #The account balance after the change
Date date, #Date that balance change occurred
int aid #Account that the balance change occurred on
It contains a couple of entries for each day of the year, and I want to retrieve the balance every five days. I also want it to separate between accounts (i.e., if two changes occurred on the same day but on separate accounts, return both).
The problem is this: sometimes there will be several days (or weeks) where there is no data available. When that occurs, I want to make sure to return the latest entry before the ‘hole’ in the dataset. This is a simplified version of the problem. The actual database is big (several gigabytes), which is why I want to return only a subset of the data. The query cannot use platform-specific methods, because it needs to work on both Oracle and MySQL.
My question is: is there any way to do this fast? I would be able to write a query that gets the job done, but I am hoping there is some devil-magic way of doing it that does not require lots of nested queries and aggregate functions.
I would use Andomar’s Period table idea, but I would try a slightly different final query. This assumes that your Account_Balances table has a PK on aid and date. If you ended up with two balances for the same account for the same exact date and time then you would get some duplicate rows.
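A query along these lines can be sketched as a self-join that keeps a balance row only when no newer row exists for the same account within the period. This is a minimal sketch under assumptions, not necessarily the exact query intended here: the `Periods` table with a single `period_end` column, the column name `change_date` (standing in for the question's `date` column), and the helper names `build_demo_db` and `latest_balances` are all hypothetical. It runs against SQLite through Python only so it can be tried quickly; the SQL itself sticks to plain joins and should be checked on Oracle and MySQL before relying on it.

```python
import sqlite3

# Hypothetical schema: Account_Balances(aid, change_date, balance) with a
# primary key on (aid, change_date), and Periods(period_end) holding one
# row per five-day cut-off date.
QUERY = """
SELECT p.period_end, ab.aid, ab.balance
FROM Periods p
JOIN Account_Balances ab
  ON ab.change_date <= p.period_end
LEFT JOIN Account_Balances newer
  ON newer.aid = ab.aid
 AND newer.change_date > ab.change_date
 AND newer.change_date <= p.period_end
WHERE newer.aid IS NULL
ORDER BY p.period_end, ab.aid
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Account_Balances (
            aid         INTEGER NOT NULL,
            change_date TEXT    NOT NULL,  -- ISO dates sort correctly as text
            balance     REAL    NOT NULL,
            PRIMARY KEY (aid, change_date)
        );
        CREATE TABLE Periods (
            period_end TEXT NOT NULL PRIMARY KEY
        );
        """
    )
    conn.executemany(
        "INSERT INTO Account_Balances (aid, change_date, balance) VALUES (?, ?, ?)",
        [
            (1, "2024-01-01", 100.0),
            (1, "2024-01-04", 120.0),
            (1, "2024-01-20", 90.0),   # after every period below
            (2, "2024-01-07", 50.0),   # account 2 has a gap before this date
        ],
    )
    conn.executemany(
        "INSERT INTO Periods (period_end) VALUES (?)",
        [("2024-01-05",), ("2024-01-10",), ("2024-01-15",)],
    )
    return conn


def latest_balances(conn):
    """Return (period_end, aid, balance) for the latest row at or before each cut-off."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in latest_balances(build_demo_db()):
        print(row)
```

In the demo data, account 1 has no change between 2024-01-04 and 2024-01-20, so its 2024-01-04 balance is carried forward into the later periods. The left join against `newer` does the work that a `MAX()` subquery would otherwise do, which avoids nested queries and aggregates; an index on `(aid, change_date)` is what would keep it reasonable on a large table.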
If the account has no rows before or during the given period you will not get a row back for it.