I have an application that works out the total cost of usage …
Usage Table
terminal  calldate    callduration  tariff
1         2011-01-01  00:01:00      1
1         2011-01-01  00:03:30      1
1         2011-01-02  00:10:00      1
1         2011-01-05  01:00:00      1
Tariff Table
id  included  extraunit  extracost
1   00:05:00  00:01:00   10
Using the above tables, I work out the total cost per terminal per month. First I convert all times to seconds, and then I use the following formula to calculate the cost:
cost = ((totalduration - included) / extraunit ) * extracost
695 = ((4470 - 300) / 60 ) * 10
If totalduration - included is negative, the cost is 0.
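As a minimal sketch of the formula above (written in Python rather than the PHP the real calculation uses; the function names to_seconds and monthly_cost are hypothetical), the monthly total for the sample data can be reproduced like this:

```python
def to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def monthly_cost(durations, included, extraunit, extracost):
    """cost = ((totalduration - included) / extraunit) * extracost, floored at 0."""
    total = sum(to_seconds(d) for d in durations)
    # A negative billable duration means everything was covered by the included time.
    billable = max(0, total - to_seconds(included))
    return billable / to_seconds(extraunit) * extracost


# Sample usage rows for terminal 1 and tariff 1 from the tables above.
cost = monthly_cost(
    ["00:01:00", "00:03:30", "00:10:00", "01:00:00"],
    included="00:05:00",
    extraunit="00:01:00",
    extracost=10,
)
print(cost)  # 695.0
```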
I do the calculation in PHP. This all works fine: I can create an invoice per terminal, and the costs are shown per terminal per month.
Month    Terminal  Cost
2011-01  1         695
What I now need to do is add a column to the original usage table (I may use a separate table, but I am keeping it simple for this question) to record the cost of each record within the usage table.
So I know the total cost per month and can query the usage table for all the records that make up that cost. What I cannot figure out is how to allocate a cost to each record while taking the included time into account. If there were no included time, I would divide the total cost by the total duration and then work out how much each record costs. With included time, I can still work out the cost per billable second, 695 / 4170 = 0.1666666667 (4170 being the 4470 total seconds minus the 300 included), and from that the cost per record. This is the output I am trying to achieve:
terminal  calldate    callduration  tariff  cost
1         2011-01-01  00:01:00      1       0.00
1         2011-01-01  00:03:30      1       0.00
1         2011-01-02  00:10:00      1       95.00 (rounded to 2 decimal places)
1         2011-01-05  01:00:00      1       600.00 (rounded to 2 decimal places)
Line 1 uses 60 seconds of the included 5 minutes, line 2 uses 210 seconds, and line 3 uses the last 30 seconds. The remainder of line 3 (570 seconds) is charged, and line 4 is charged in full.
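The allocation rule just described can be sketched as a running balance of included seconds that each record draws down in chronological order. This is only an illustration in Python, assuming the records are already sorted by call date and that the chargeable part is billed pro rata per second (which is what the 95.00 above implies); allocate_costs is a hypothetical name.

```python
def allocate_costs(durations, included, extraunit, extracost):
    """Return one cost per record; all arguments are in seconds except extracost.

    Assumes `durations` is in chronological order for one terminal and month.
    """
    remaining = included  # included seconds not yet used up
    costs = []
    for duration in durations:
        free = min(duration, remaining)  # part of this call covered by included time
        remaining -= free
        chargeable = duration - free
        # Pro-rata charge: extracost per extraunit seconds.
        costs.append(round(chargeable * extracost / extraunit, 2))
    return costs


# Sample data: 60 s, 210 s, 600 s and 3600 s against 300 included seconds.
costs = allocate_costs([60, 210, 600, 3600], included=300, extraunit=60, extracost=10)
print(costs)       # [0.0, 0.0, 95.0, 600.0]
print(sum(costs))  # 695.0
```

Because each record only needs the balance left by the previous one, a single ordered pass per terminal is enough, and records with a cost of 0 are exactly the ones fully covered by the included time.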
Can someone please point me in the right direction? I don't think this is going to be possible purely in MySQL, but it would be very nice if I could at least get only the records that need a cost applied, i.e. ignoring the records covered by the included time.
The usage table contains 80 million records per month (there are 800,000 terminals) – so it needs to scale quite well …
Update
Creating the total cost per terminal each month takes approximately 3 hours. To keep this time down, I group the call durations using sum(time_to_sec(callduration)), which gives me the total cost per month per terminal. This total is the top priority and should be available as soon as possible. The cost per usage record is less urgent and is only required some time later, which is why I can work it out in a separate process. Maybe it is better to revisit that original plan, work out the cost per usage line first, and then total these up?
Update 2
Added some expected output and some more detail on how the cost per record is calculated. I have all of the calculations, and that's the straightforward part. The difficulty is working out what is included and what is not.
Any help / advice is appreciated.
The cost per usage is going to be hard to calculate and to get it to sum and match the overall cost if the overall cost is calculated on an aggregate basis.
For example
What happens here? Does this include 30s of included time? If so, how is the other 9m30s broken up, given that your units are 1m? Is it calculated pro rata?
In terms of the monthly summary I would maybe build it up day by day per terminal so you would get something like this
Then each day you can sum up the time allocated per terminal the day before and add a new line with the updated calculation.
EDIT
I believe you can do this in pure SQL. This should get you started:
From that, you should either be able to add a cumulative duration column and work from that or you can probably stitch it into an UPDATE and just log the actual cost per usage using a CASE statement.
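As a rough illustration of the cumulative-duration idea (not the query from this answer), the sketch below runs against an in-memory SQLite database from Python. The table name usage_log, the id and secs columns, and storing durations as seconds are all assumptions made for the example; it also assumes id order matches call order. The correlated subquery is easy to read but would need a faster equivalent at 80 million rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE usage_log (id INTEGER PRIMARY KEY, terminal INT, calldate TEXT,
                        secs INT, tariff INT);
CREATE TABLE tariff (id INTEGER PRIMARY KEY, included INT, extraunit INT,
                     extracost REAL);
INSERT INTO usage_log (terminal, calldate, secs, tariff) VALUES
  (1, '2011-01-01', 60, 1), (1, '2011-01-01', 210, 1),
  (1, '2011-01-02', 600, 1), (1, '2011-01-05', 3600, 1);
INSERT INTO tariff VALUES (1, 300, 60, 10);
""")

query = """
SELECT c.id, c.cumul,
       CASE
         -- call ends inside the included time: free
         WHEN c.cumul <= t.included THEN 0
         -- call starts after the included time is used up: fully charged
         WHEN c.cumul - c.secs >= t.included
           THEN c.secs * t.extracost / t.extraunit
         -- call straddles the boundary: charge only the overflow
         ELSE (c.cumul - t.included) * t.extracost / t.extraunit
       END AS cost
FROM (SELECT u.id, u.secs, u.tariff,
             (SELECT SUM(p.secs) FROM usage_log p
               WHERE p.terminal = u.terminal
                 AND substr(p.calldate, 1, 7) = substr(u.calldate, 1, 7)
                 AND p.id <= u.id) AS cumul   -- running total per terminal and month
        FROM usage_log u) c
JOIN tariff t ON t.id = c.tariff
ORDER BY c.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (id, cumulative seconds, cost)
```

On the sample data the first two rows come out with a cost of 0, the third with 95.0 and the fourth with 600.0, matching the expected output in the question.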
EDIT
OK, given your comments, I think this works if you want to do it in pure SQL. It requires 2 extra columns (cumul_duration and cost).
I haven't worked out how to do it all in one query, and I haven't tested it against more than one tariff, but it works on your sample data.