I have a multi-dimensional array similar to the example below that I want to group together using Ruby’s zip method. I have it working fine when each inner array has the same number of elements, but am running into problems when they are different lengths.
In the example below, the second set is missing a record at 00:15. How would I fill in this missing record?
What am I considering a gap?
It’s the timestamp that constitutes a gap. Take a look at my first code sample where I have a comment about the gap being at 00:15. All the other arrays have a hash with this timestamp, so I consider this to be a “missing record” or “gap”. The timestamp really could be some other unique string so the fact that they are 15 minutes apart is irrelevant. The values are also irrelevant.
The only approach that comes to mind involves looping over the arrays twice. The first pass would build an array of unique timestamps, and the second pass would fill in the missing record(s) wherever a timestamp is not present. I’m comfortable coding this approach, but it seems a little hacky, and Ruby always seems to surprise me with an elegant and concise solution.
I start with this:
values = [
  [
    {:timestamp => "2011-01-01 00:00", :value => 1},
    {:timestamp => "2011-01-01 00:15", :value => 2},
    {:timestamp => "2011-01-01 00:30", :value => 3}
  ],
  [ # There's a gap here at 00:15
    {:timestamp => "2011-01-01 00:00", :value => 1},
    {:timestamp => "2011-01-01 00:30", :value => 3}
  ],
  [
    {:timestamp => "2011-01-01 00:00", :value => 1},
    {:timestamp => "2011-01-01 00:15", :value => 2},
    {:timestamp => "2011-01-01 00:30", :value => 3}
  ]
]
I want to end with this:
values = [
  [
    {:timestamp => "2011-01-01 00:00", :value => 1},
    {:timestamp => "2011-01-01 00:15", :value => 2},
    {:timestamp => "2011-01-01 00:30", :value => 3}
  ],
  [ # The gap has been filled with a nil value
    {:timestamp => "2011-01-01 00:00", :value => 1},
    {:timestamp => "2011-01-01 00:15", :value => nil},
    {:timestamp => "2011-01-01 00:30", :value => 3}
  ],
  [
    {:timestamp => "2011-01-01 00:00", :value => 1},
    {:timestamp => "2011-01-01 00:15", :value => 2},
    {:timestamp => "2011-01-01 00:30", :value => 3}
  ]
]
When all the arrays are the same size, values.transpose will produce:
[
  [
    {:value=>1, :timestamp=>"2011-01-01 00:00"},
    {:value=>1, :timestamp=>"2011-01-01 00:00"},
    {:value=>1, :timestamp=>"2011-01-01 00:00"}
  ],
  [
    {:value=>2, :timestamp=>"2011-01-01 00:15"},
    {:value=>nil, :timestamp=>"2011-01-01 00:15"},
    {:value=>2, :timestamp=>"2011-01-01 00:15"}
  ],
  [
    {:value=>3, :timestamp=>"2011-01-01 00:30"},
    {:value=>3, :timestamp=>"2011-01-01 00:30"},
    {:value=>3, :timestamp=>"2011-01-01 00:30"}
  ]
]
The approach you outlined is correct, and it turns out Ruby is very well suited to expressing it elegantly. This would do it, for example:
The first line gets a list of unique timestamps (maps each log to an array of its timestamps, flattens those arrays into a single array, keeps only the unique entries, and sorts them).
The second line fills in the gaps (loops through the logs and, for each timestamp in the unique list, uses that log’s row if there is one, otherwise inserts a new nil-valued row).
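As a minimal sketch of those two steps (not necessarily the exact code originally posted), assuming the data is the values array from the question and that the timestamps are strings that sort correctly as plain text. The names timestamps, filled, and grouped are illustrative only:

```ruby
# Sample data from the question; the second log has no 00:15 row.
values = [
  [
    { :timestamp => "2011-01-01 00:00", :value => 1 },
    { :timestamp => "2011-01-01 00:15", :value => 2 },
    { :timestamp => "2011-01-01 00:30", :value => 3 }
  ],
  [
    { :timestamp => "2011-01-01 00:00", :value => 1 },
    { :timestamp => "2011-01-01 00:30", :value => 3 }
  ],
  [
    { :timestamp => "2011-01-01 00:00", :value => 1 },
    { :timestamp => "2011-01-01 00:15", :value => 2 },
    { :timestamp => "2011-01-01 00:30", :value => 3 }
  ]
]

# Step 1: collect every timestamp that appears in any log,
# deduplicated and sorted.
timestamps = values.map { |logs| logs.map { |row| row[:timestamp] } }.flatten.uniq.sort

# Step 2: rebuild each log in timestamp order, reusing the existing row
# when the log has one and inserting a nil-valued row when it does not.
filled = values.map do |logs|
  timestamps.map do |ts|
    logs.find { |row| row[:timestamp] == ts } || { :timestamp => ts, :value => nil }
  end
end

# All logs now have the same length, so they can be grouped by position.
grouped = filled.transpose
```

With the sample data, filled[1] gains a row for 00:15 whose :value is nil, and grouped[1] holds the three 00:15 rows. Note that find scans a log once per timestamp; for large logs it may be worth indexing each log by timestamp in a Hash first.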