While analyzing energy demand and consumption data, I’m having trouble resampling and interpolating time-series trend data.
Data set example:
timestamp               value (kWh)
---------------------   -----------
12/19/2011 5:43:21 PM   79178
12/19/2011 5:58:21 PM   79179.88
12/19/2011 6:13:21 PM   79182.13
12/19/2011 6:28:21 PM   79183.88
12/19/2011 6:43:21 PM   79185.63
Based on these observations, I’d like to roll the values up to a fixed period, with the frequency set to a unit of time.
For example, readings on the hour, with any gaps from missing data filled in:
timestamp               value (approx)
---------------------   --------------
12/19/2011 5:00:00 PM   79173
12/19/2011 6:00:00 PM   79179
12/19/2011 7:00:00 PM   79186
For a linear algorithm, it seems I would take the difference in time between two readings and scale the value by that factor.
TimeSpan ts = current - previous;
Double factor = ts.TotalMinutes / period; // period: presumably the resampling interval in minutes
The interpolated value and its timestamp could then be calculated from that factor.
With so much information available, I’m unsure why it’s so hard to find an elegant approach to this.
Perhaps first, are there open source analysis libraries that could be recommended?
Any recommendations for a programmatic approach? Ideally C#, or possibly with SQL?
Or, any similar questions (with answers) I could be pointed to?
By using the ticks that are used internally to represent DateTime values, you get the most accurate result possible. Since these ticks do not restart at zero at midnight, you will not have problems at day boundaries.
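As a minimal sketch of the tick-based calculation, assuming two known readings (t0, e0) and (t1, e1) and a target time tf: the class and method names below are hypothetical and chosen only for illustration, and the sample values come from the question's data.

```csharp
using System;

public static class LinearInterpolation
{
    // Estimates the value at time tf from two known readings (t0, e0) and (t1, e1).
    // The arithmetic is done on DateTime.Ticks, which keep counting across midnight,
    // so day boundaries need no special handling.
    public static double Interpolate(DateTime t0, double e0, DateTime t1, double e1, DateTime tf)
    {
        if (t1 == t0)
            throw new ArgumentException("t0 and t1 must be different points in time.");

        double elapsed = tf.Ticks - t0.Ticks; // ticks from the first reading to the target time
        double span = t1.Ticks - t0.Ticks;    // ticks between the two readings

        // ef = e0 + (tf - t0) * (e1 - e0) / (t1 - t0)
        return e0 + elapsed * (e1 - e0) / span;
    }

    public static void Main()
    {
        // The two readings from the question that surround 6:00:00 PM.
        var t0 = new DateTime(2011, 12, 19, 17, 58, 21);
        var t1 = new DateTime(2011, 12, 19, 18, 13, 21);
        var tf = new DateTime(2011, 12, 19, 18, 0, 0);

        double ef = Interpolate(t0, 79179.88, t1, 79182.13, tf);
        Console.WriteLine(ef); // roughly 79180.13
    }
}
```

If tf lies outside the interval from t0 to t1, the same expression extrapolates along the line instead of interpolating, which may or may not be what you want at the ends of a data set.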
Explanation of the formula
In geometry, similar triangles are triangles that have the same shape but different sizes. The formula above is based on the fact that the ratios of any two sides in one triangle are the same for the corresponding sides of a similar triangle.
If you have a triangle A B C and a similar triangle a b c, then
A : B = a : b. The equality of two ratios is called a proportion.
We can apply this proportionality rule to our problem, where (t0, e0) and (t1, e1) are the two known readings and ef is the value we are looking for at time tf:
(e1 - e0) / (t1 - t0) = (ef - e0) / (tf - t0)
We can multiply both sides of the equation above by (tf - t0):
(tf - t0) * (e1 - e0) / (t1 - t0) = (ef - e0)
By adding e0 on both sides we can isolate ef:
e0 + (tf - t0) * (e1 - e0) / (t1 - t0) = ef
Let’s swap the two sides:
ef = e0 + (tf - t0) * (e1 - e0) / (t1 - t0)
That’s it. The way we calculated the intermediate value is called "linear interpolation".
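As a quick check, here is the formula applied to the question's sample data for tf = 6:00:00 PM. This assumes the two readings on either side of the target time are the ones used, and it measures the time differences in seconds:
t0 = 5:58:21 PM, e0 = 79179.88
t1 = 6:13:21 PM, e1 = 79182.13
tf - t0 = 99 s, t1 - t0 = 900 s, e1 - e0 = 2.25
ef = 79179.88 + 99 * 2.25 / 900 = 79179.88 + 0.2475 ≈ 79180.13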