I’m trying to work out the most efficient method to find the linear regression equation (y = mx + c) for a dataset, given a 2 by n array.
Basically I want to know what the value of Y is when X is, for example, 50.
My current method leaves a lot to be desired:
inputData is my 2 by n array, with X in the first column and Y in the second.
x = 50
for i = 1 : size(inputData,1) % for every line in the inputData array
    if (inputData(i,1) < x + 5) && (inputData(i,1) > x - 5) % if we're within 5 of the specified X value
        arrayOfCloseYValues(i) = inputData(i, 2); % add the other position to the array
    end
end
y = mean(arrayOfCloseYValues) % take the mean to find Y
As you can see, my above method simply tries to find values of Y that are within 5 of the given X value and gets the mean. This is a terrible method, plus it takes absolutely ages to process.
What I really need is a robust method for calculating the linear regression for X and Y, so that I can find the value through the equation y = mx + c…
PS. In my above method I do actually pre-allocate memory and remove trailing zeros at the end, but I have removed this part for simplicity.
Polyfit is fine, but I think your problem is a bit simpler. You have a 2 x n array of data. Let’s say column 1 is y and column 2 is x, then:
Should give you a least squares regression for the slope and offset.
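A minimal sketch of one way to get that fit with the backslash operator. It assumes a hypothetical array `data` with one observation per row, y in column 1 and x in column 2; the variable names and example values are illustrative, not taken from the question.

```matlab
% Sketch only: `data` is a hypothetical n-by-2 array, y in column 1, x in column 2.
data = [3 1; 5 2; 7 3; 9 4];   % example values lying exactly on y = 2x + 1

y = data(:, 1);
x = data(:, 2);

% Design matrix [x, 1]; backslash solves A * coeffs = y in the least-squares sense.
A = [x, ones(size(x))];
coeffs = A \ y;

m = coeffs(1);   % slope
c = coeffs(2);   % offset

% Predict y at a given x, e.g. x = 50 as in the question
yAt50 = m * 50 + c;
```

For this example data the fit should come out at roughly m = 2 and c = 1, so `yAt50` is about 101. There is no loop over rows, which is where the original approach spends its time.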
Here’s another way to test it:
Should get you:
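One possible check, sketched under assumptions: build synthetic data from a known slope and offset, then see whether the fit recovers them. The names `mTrue` and `cTrue` and their values are made up for illustration, and `polyfit` is used only as a cross-check.

```matlab
% Sketch only: synthetic data with a known slope and offset (illustrative values).
mTrue = 2.5;
cTrue = -4;
x = (1:100)';
y = mTrue * x + cTrue;          % noise-free, so the fit should recover mTrue and cTrue
data = [y, x];                  % column 1 is y, column 2 is x

% Least-squares fit with backslash: coeffs(1) is the slope, coeffs(2) the offset.
coeffs = [data(:, 2), ones(size(data, 1), 1)] \ data(:, 1);

% polyfit with degree 1 returns [slope, offset] and should agree with coeffs.
p = polyfit(data(:, 2), data(:, 1), 1);

disp(coeffs');   % values close to mTrue and cTrue
disp(p);
```

With noise added to `y`, the recovered values would only be close to `mTrue` and `cTrue` rather than matching them.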
