I’ve got a struct array with three fields – an array, the array’s length, and a number.
N = 5;
data = struct;
for i = 1:N
    n = ceil(rand * 3);        % random length between 1 and 3
    data(i).len = n;
    data(i).array = rand(1,n);
    data(i).number = i;
end
The data looks like this:
data =
1x5 struct array with fields:
len = [ 1 3 3 1 1 ]
array = [[0.8]; [0.7 0.9 0.4]; [0.7 0 0.3]; [0.1]; [0.3]]
number = [ 1 2 3 4 5 ]
I can return array as a 1×9 array in several ways:
>> [data.array]
>> cat(2,data.array)
[0.8 | 0.7 0.9 0.4 | 0.7 0 0.3 | 0.1 | 0.3] % | shows array separation
I’d like to repeat the number (data.number) len times, to produce the same length array as the concatenated array.
I’m currently doing this with arrayfun then cell2mat:
>> x = arrayfun(@(x) repmat(x.number, 1, x.len), data, 'UniformOutput', false)
x =
[1] [1x3 double] [1x3 double] [4] [5]
>> cell2mat(x)
[ 1 2 2 2 3 3 3 4 5]
This makes the numbers line up with the arrays.
arrays = [ 0.8 | 0.7 0.9 0.4 | 0.7 0 0.3 | 0.1 | 0.3 ]
numbers = [ 1 | 2 2 2 | 3 3 3 | 4 | 5 ]
The idea behind this is to feed the data to the GPU for processing – but rearranging the data takes orders of magnitude longer than the actual processing.
Arrayfun takes ~5 seconds when N=100,000, and a for loop calling repmat takes ~4 seconds.
Is there a faster way to rearrange data from uneven arrays in structures into matching length 1d arrays? I’m open to using a different data structure.
Testing the vectorised method, with one zero-length entry:
data = struct;
data(1).len = 1;
data(1).array = [1 2 3];
data(1).number = 11;
data(2).len = 0;
data(2).array = [];
data(2).number = 12;
data(3).len = 2;
data(3).array = [4 5 6; 7 8 9];
data(3).number = 13;
list_of_array = cat(1,data.array)
idx = zeros(1,size(list_of_array,1));
% Set start of each array to 1
len = cumsum([data.len])
idx(len) = 1
% Flat indices
idx = cumsum([1 idx(1:end-1)])
nf = [data.number]
repeated_num_faces = nf(idx)
Gives the output:
list_of_array =
1 2 3
4 5 6
7 8 9
len =
1 1 3 % Cumulative lengths
idx =
1 0 1 % Ones at start
idx =
1 2 2 % Flat indexes - should be [1 3 3]
nf =
11 12 13 % Numbers expanded
repeated_num_faces =
11 12 12 % Wrong .numbers - should be [11 13 13]
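The mismatch comes from the zero-length entry: data(2) contributes no rows, so its cumulative length collides with that of data(1) and the marker for data(3) is lost. A minimal sketch of one way around this, assuming the same test data as above, is to drop the empty entries before marking the segment starts. The names keep, starts, marks, seg and repeated are hypothetical. The last line uses repelem as a shorter alternative, assuming MATLAB R2015a or later.

```matlab
% Same test data as above: data(2) has no rows.
data = struct;
data(1).len = 1; data(1).array = [1 2 3];        data(1).number = 11;
data(2).len = 0; data(2).array = [];             data(2).number = 12;
data(3).len = 2; data(3).array = [4 5 6; 7 8 9]; data(3).number = 13;

len   = [data.len];
nf    = [data.number];
total = sum(len);                 % number of rows in cat(1,data.array)

% Ignore empty entries so every remaining segment has a distinct start row.
keep   = len > 0;
starts = cumsum([1 len(keep)]);   % start row of each segment, plus one past the end
starts = starts(1:end-1);         % drop the trailing "one past the end" value

marks = zeros(1, total);
marks(starts) = 1;                % a single 1 at the first row of each segment
seg = cumsum(marks);              % segment index for every row

nfKeep   = nf(keep);
repeated = nfKeep(seg);           % 11 13 13

% Alternative (assumes R2015a or later): repeat nf(i) exactly len(i) times.
repeated2 = repelem(nf, len);     % 11 13 13
```

Both results have one entry per row of list_of_array, so they can be passed along with the concatenated data.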
Well, struct is not the easiest to deal with here. Definitely, you should not use repmat. Rather than that, preallocate the data_number array and do a for loop:

Here is another ‘vectorized’ solution using cumsum to mark the indices in the ‘flat’ vector:

For a data set of N=1e5 the times are:
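The two approaches named in this answer (a preallocated for loop, and cumsum marking positions in the flat vector) might be sketched as follows. This is an illustration under the struct layout from the question, not the answerer's original code, and it makes no timing claims. Apart from data_number, the variable names (pos, starts, marks, data_number2) are hypothetical.

```matlab
% Example data in the layout from the question (1-by-len row vectors).
N = 5;
data = struct;
for i = 1:N
    n = ceil(rand * 3);
    data(i).len = n;
    data(i).array = rand(1, n);
    data(i).number = i;
end

len = [data.len];

% Approach 1: preallocate the output, then fill one segment per iteration.
data_number = zeros(1, sum(len));
pos = 0;                                   % last index written so far
for i = 1:numel(data)
    data_number(pos+1 : pos+len(i)) = data(i).number;
    pos = pos + len(i);
end

% Approach 2: mark the first element of each segment, then cumsum.
% Assumes every len(i) >= 1, which holds for the data generated above.
starts = cumsum([1 len(1:end-1)]);         % first flat index of each segment
marks  = zeros(1, sum(len));
marks(starts) = 1;
nf = [data.number];
data_number2 = nf(cumsum(marks));          % cumsum(marks) is the segment index

isequal(data_number, data_number2)         % both approaches should agree
```

The loop writes each number into a slice of an array that already has its final size, which avoids the repeated allocation and concatenation that repmat followed by cell2mat incurs.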