I have a matrix Data in this format:
Data = [Date Time Price]
I want to plot the Price against the Time, but my data set is very large and contains lines with multiple Prices for the same Date/Time, e.g. the 1st and 2nd lines here:
29 733575.459548611 40.0500000000000
29 733575.459548611 40.0600000000000
29 733575.459548612 40.1200000000000
29 733575.45954862 40.0500000000000
I want to take the average of the prices that share the same Date/Time and get rid of the extra lines. My goal is to do linear interpolation on the values, which is why I must have exactly one Price value per Time.
How can I do this? I tried the following (it reduces the matrix so that only the first line is kept for lines with repeated date/times), but I don’t know how to take the average:
function [ C ] = test( DN )
[Qrows, cols] = size(DN);
C = DN(1,:);
for i = 1:(Qrows-1)
    if DN(i,2) == DN(i+1,2)
        %n = 1;
        %while DN(i,2) == DN(i+n,2) && i+n<Qrows
        %    n = n + 1;
        %end
        % somehow take average;
    else
        C = [C;DN(i+1,:)];
    end
end
If you pass unique(A,'rows') only the columns you do not want to average over as its input A (here: date and time), its third output ic contains one value for every row, and rows you want to combine share the same value.
Getting from there to the means you want is probably more intuitive for MATLAB beginners with a for loop: with logical indexing, e.g. DN(ic==n,3), you get a vector of all values you want to average (where n is the index of the date-time row they belong to). You need to do this for every distinct date-time combination.
A more vector-oriented way is to use accumarray, which solves your problem in two lines: one call to unique to get ic, and one call to accumarray that uses ic as the subscripts, the price column as the values and @mean as the function.
I’m not quite sure what you want the result to look like, but [DataAndTime Price] (the unique date-time rows next to the averaged prices) gives you the three-column format of the input again.
Note that if your input contains something like
1 0.1 23
1 0.1 23
1 0.1 42
then applying unique(...,'rows') to the whole input before the two calls above gives a different result for 1 0.1 than using the two calls directly: the latter calculates the mean of 23, 23 and 42, while in the former case one 23 is eliminated as a duplicate beforehand, so the differing row with 42 gets a greater weight in the average.
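The approach described above can be sketched as follows. This is a minimal sketch, not necessarily the exact code from the original answer: it uses the four rows from the question as sample input, the name PriceLoop is only illustrative, and DataAndTime and Price follow the names used in the text. It assumes a MATLAB version that supports the ~ output placeholder.

```matlab
% Sample input in the question's layout: [Date Time Price].
DN = [29 733575.459548611 40.05
      29 733575.459548611 40.06
      29 733575.459548612 40.12
      29 733575.45954862  40.05];

% ic maps every input row to its row in the list of unique date-time pairs.
[DataAndTime, ~, ic] = unique(DN(:,1:2), 'rows');

% Loop version: average the prices of each date-time group in turn.
PriceLoop = zeros(size(DataAndTime,1), 1);   % illustrative name
for n = 1:size(DataAndTime,1)
    PriceLoop(n) = mean(DN(ic == n, 3));
end

% Vectorized version: accumarray groups column 3 by ic and applies @mean.
Price = accumarray(ic, DN(:,3), [], @mean);

% Back to the three-column layout, now with one row per date-time.
Result = [DataAndTime Price];
```

Both versions should give the same averages. Note that the times are compared for exact equality, so two time stamps that differ only in the last digits count as different groups. For the duplicate example, averaging 23, 23 and 42 directly gives 88/3 (about 29.33), whereas removing the duplicate row first gives (23 + 42)/2 = 32.5.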