Possible Duplicate:
Specifying formula in R with glm without explicit declaration of each covariate
how to succinctly write a formula with many variables from a data frame?
I have a vector of Y values and a matrix of X values that I want to perform a multiple regression on (i.e. Y = X[column 1] + X[column 2] + … + X[column N]).
The problem is that the number of columns in my matrix (N) is not known in advance. I know that in R, to perform a linear regression, you have to spell out the formula:
fit = lm(Y~X[,1]+X[,2]+X[,3])
But how do I do this if I don’t know how many columns are in my X matrix?
Thanks!
Here are three ways, in increasing order of flexibility.
Method 1
Run your regression using the formula notation:
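A minimal sketch of one way to do this, assuming Y is a numeric vector and X is a numeric matrix with one row per observation. The simulated data below are made up purely for illustration; the point is that a matrix on the right-hand side of the formula stands for all of its columns at once.

```r
# Simulated data, purely for illustration
set.seed(1)
n <- 50
X <- matrix(rnorm(n * 4), nrow = n)  # 4 columns here, but any number works
Y <- rnorm(n)

# A matrix on the right-hand side contributes one term per column,
# so the formula does not need to name the columns individually
fit <- lm(Y ~ X)

coef(fit)  # intercept plus one coefficient per column of X
```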
Method 2
Put all your data in one data.frame, not two:
Then run your regression using the formula notation:
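Both steps might look like the sketch below. It assumes Y is a numeric vector and X a numeric matrix with matching rows; the simulated data and the name dat are illustrative only. In a formula, the dot stands for every column of the data frame other than the response.

```r
# Simulated data, purely for illustration
set.seed(1)
n <- 50
X <- matrix(rnorm(n * 4), nrow = n)
Y <- rnorm(n)

# Step 1: combine the response and the predictors into one data frame
# (the predictor columns get automatic names if X has no column names)
dat <- data.frame(Y = Y, X)

# Step 2: regress Y on every other column of dat
fit <- lm(Y ~ ., data = dat)

coef(fit)
```

Because the formula never mentions the predictors by name, the same call works however many columns X has.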
Method 3
Another way is to build the formula yourself:
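A sketch of that approach, assuming the data sit in a data frame with a response column named Y. The data frame dat, its column names, and the particular choice of xvars are made up for illustration.

```r
# Simulated data, purely for illustration
set.seed(1)
n <- 50
dat <- data.frame(Y = rnorm(n), a = rnorm(n), b = rnorm(n), c = rnorm(n))

# Hypothetical selection of predictors; this could be computed at run time
xvars <- c("a", "c")

# Paste the pieces into a string, then convert the string to a formula
form_text <- paste("Y ~", paste(xvars, collapse = " + "))
form <- as.formula(form_text)

fit <- lm(form, data = dat)
coef(fit)
```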
In this example, xvars is a character vector containing the names of the variables you want to use.