I want to fit a simple linear model Y = x1 + x2 + x3 + x4 + x5 using ordinary least squares, with the constraint that the coefficients must sum to 5. How can I accomplish this in R? All of the packages I’ve seen seem to allow for constraints on individual coefficients, but I can’t figure out how to set a single constraint that affects all of the coefficients jointly. I’m not tied to OLS; if this requires an iterative approach, that’s fine as well.
The basic math is as follows: we start with

mu = a0 + a1*x1 + a2*x2 + a3*x3 + a4*x4

and we want to find a0–a4 to minimize the SSQ between mu and our response variable y.

If we replace the last parameter (say a4) with (say) C-a1-a2-a3 to honour the constraint, we end up with a new set of linear equations

mu = a0 + a1*x1 + a2*x2 + a3*x3 + (C-a1-a2-a3)*x4
   = a0 + a1*(x1-x4) + a2*(x2-x4) + a3*(x3-x4) + C*x4

(note that a4 has disappeared …)

Something like this (untested!) implements it in R.
Original data frame:
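The data themselves are not shown here, so as a stand-in this sketch simulates a data frame with a response `y` and four predictors `x1`..`x4`. The seed, the sample size of 20, and the uniform draws are assumptions made purely for illustration.

```r
# Hypothetical example data: a response y and four predictors x1..x4.
set.seed(101)  # arbitrary seed, only so the example is reproducible
d <- data.frame(y  = runif(20),
                x1 = runif(20),
                x2 = runif(20),
                x3 = runif(20),
                x4 = runif(20))
str(d)
```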
Create a transformed version where all but the last column have the last column “swept out”, e.g. x1 -> x1-x4; x2 -> x2-x4; ...
Rename the swept columns to tx1, tx2, … to minimize confusion.
Sum-of-coefficients constraint: C = 5 (the value from the question).
Now fit the model with an offset:
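A minimal sketch of the sweep, rename, constraint, and fit steps taken together. The data frame `d` is simulated here (an assumption, so that the snippet runs on its own), and the names `dtrans`, `constr`, `a123`, and `a4` are chosen for illustration.

```r
# Stand-in data (hypothetical): response y and predictors x1..x4.
set.seed(101)
d <- data.frame(y = runif(20), x1 = runif(20), x2 = runif(20),
                x3 = runif(20), x4 = runif(20))

# Sweep x4 out of x1..x3 (row-wise subtraction); keep x4 for the offset.
dtrans <- data.frame(y = d$y,
                     sweep(d[, c("x1", "x2", "x3")], 1, d$x4, "-"),
                     x4 = d$x4)
names(dtrans)[2:4] <- paste0("t", names(dtrans)[2:4])  # tx1, tx2, tx3

constr <- 5  # required sum of a1..a4

# constr * x4 has a fixed coefficient of 1, so it enters as an offset.
fit <- lm(y ~ tx1 + tx2 + tx3, offset = constr * x4, data = dtrans)

a123 <- coef(fit)[c("tx1", "tx2", "tx3")]  # these are a1, a2, a3
a4   <- constr - sum(a123)                 # back out the eliminated coefficient
sum(c(a123, a4))                           # equals constr by construction
```

The coefficients on `tx1`..`tx3` are directly the original `a1`..`a3`, so only `a4` has to be recovered from the constraint.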
It wouldn’t be too hard to make this more general.
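One way the generalization might look, as a sketch only: `lm_sum_constraint()` is a hypothetical helper (not part of any package) that takes a data frame, the name of the response, a character vector of predictor names, and the required sum. It eliminates the last predictor's coefficient and fits the rest with `lm.fit()`, passing the fixed term as an offset. The simulated data, whose true slopes sum to 5, are also an assumption.

```r
# Hypothetical helper: least squares with the slopes constrained to sum to `total`.
lm_sum_constraint <- function(data, response, predictors, total) {
  stopifnot(length(predictors) >= 2)
  last  <- predictors[length(predictors)]
  first <- predictors[-length(predictors)]
  # Sweep the last predictor out of all the others.
  swept <- sweep(as.matrix(data[, first, drop = FALSE]), 1, data[[last]], "-")
  # total * last has a known coefficient, so it is supplied as an offset.
  fit <- lm.fit(cbind(Intercept = 1, swept), data[[response]],
                offset = total * data[[last]])
  slopes <- fit$coefficients[-1]
  # Return intercept, free slopes, and the slope recovered from the constraint.
  c(fit$coefficients[1], slopes, setNames(total - sum(slopes), last))
}

# Simulated check: true intercept 1, true slopes 2, 1.5, 1, 0.5 (sum = 5).
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 4), n, 4, dimnames = list(NULL, paste0("x", 1:4)))
dat <- data.frame(y = drop(1 + X %*% c(2, 1.5, 1, 0.5)) + rnorm(n, sd = 0.1), X)

est <- lm_sum_constraint(dat, "y", paste0("x", 1:4), total = 5)
est
sum(est[-1])  # the slopes sum to 5 up to floating-point error
```

Using `lm.fit()` rather than `lm()` avoids building a formula from strings, at the cost of losing the usual `summary()` machinery; wrapping `lm()` instead would be a reasonable alternative.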
This requires a little more thought and manipulation than simply specifying a constraint to a canned optimization program. On the other hand, (1) it could easily be wrapped in a convenience function; (2) it’s much more efficient than calling a general-purpose optimizer, since the problem is still linear (and in fact one dimension smaller than the one you started with). It could even be done with big data (e.g. biglm). (Actually, it occurs to me that since this is a linear model, you don’t even need the offset: you could subtract C*x4 from the response instead, although using the offset means the fitted values stay on the scale of y and you don’t have to add C*x4 back after you finish.)
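As a quick sanity check on the role of the offset, this sketch fits the constrained model two ways: once with `C*x4` as an offset and once with `C*x4` subtracted from the response. The simulated data and the names `fit_offset` and `fit_shift` are assumptions for illustration; `I()` is used in the formula in place of a separately transformed data frame.

```r
# Stand-in data (hypothetical): response y and predictors x1..x4.
set.seed(101)
d <- data.frame(y = runif(20), x1 = runif(20), x2 = runif(20),
                x3 = runif(20), x4 = runif(20))
constr <- 5

# Version 1: constr * x4 enters as an offset.
fit_offset <- lm(y ~ I(x1 - x4) + I(x2 - x4) + I(x3 - x4),
                 offset = constr * x4, data = d)

# Version 2: no offset; constr * x4 is moved to the left-hand side.
fit_shift <- lm(I(y - constr * x4) ~ I(x1 - x4) + I(x2 - x4) + I(x3 - x4),
                data = d)

# The two parameterizations give the same coefficients and residuals.
all.equal(coef(fit_offset), coef(fit_shift))
all.equal(unname(residuals(fit_offset)), unname(residuals(fit_shift)))
```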