As you know, with the CORRB option you can have logistic regression or linear regression in SAS output a correlation matrix of the estimates. I am not sure how to read this matrix. I have two variables which are clearly strongly positively correlated. From PROC CORR, I can see that the Pearson correlation coefficient of these two variables is 0.7+. But the estimates matrix from both logistic regression and linear regression gives me -0.7. The strength of the correlation is about the same, but the sign is reversed. Can anyone explain this? Many thanks.
You are reading the values correctly; they just mean different things. PROC CORR gives you the correlation of the variables, while CORRB gives the correlation of the estimated coefficients of these variables in the model.
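To make the distinction concrete, here is a minimal sketch in Python rather than SAS. Everything in it is an assumption for illustration: the data are simulated, and the helper names `pearson` and `invert` are made up. It computes the correlation of two predictors and, separately, the correlation of their ordinary least squares coefficient estimates, using the fact that the covariance of the estimates is proportional to the inverse of X'X.

```python
import math
import random

random.seed(0)

# Simulated data (an assumption for illustration): x2 is built to be
# positively correlated with x1.
n = 500
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [a + random.gauss(0, 1) for a in x1]

def pearson(u, v):
    """Pearson correlation of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(m)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

# Design matrix columns: intercept, x1, x2.
cols = [[1.0] * n, x1, x2]
xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

# Cov(b) = sigma^2 * (X'X)^-1; sigma^2 cancels when forming a correlation,
# so the response y is not needed here.
c = invert(xtx)
corr_b1_b2 = c[1][2] / math.sqrt(c[1][1] * c[2][2])

r_x = pearson(x1, x2)
print(f"correlation of x1 and x2: {r_x:.3f}")
print(f"correlation of b1 and b2: {corr_b1_b2:.3f}")
```

In this setting (linear regression with an intercept and exactly two predictors) the two numbers have the same magnitude and opposite signs. The sketch does not cover logistic regression, where the estimates are weighted and the match is only approximate.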
Here is an intuitive explanation of why positively correlated predictors will have negatively correlated coefficients. Suppose
y = a + b1*x1 + b2*x2 + eps. If you increase b1 a little from its best value obtained from the regression, then the predicted value for y will also increase (for positive x1), which makes the overall fit worse. One way to compensate for that and move the predicted values closer to the observed ones is to decrease b2: since high values of x1 are associated with high values of x2, you will get back close to the original fit. This shows that the uncertainty in b2 is negatively correlated with the uncertainty in b1: increasing one while decreasing the other will lead to similar fits. It might be instructive to look at the extreme case of perfect correlation:
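x2 = x1. Then the model collapses to y = a + (b1 + b2)*x1 + eps, so any pair of coefficients with the same sum gives exactly the same predictions. If, for example, the best-fitting sum is 5, then
b2 = 5 - b1 and the coefficients have a perfect negative correlation.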
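The perfectly collinear case can be checked with a few lines of Python. This is only a sketch: the x values, the intercept `a`, and the three coefficient pairs are made up for illustration, and the only thing taken from the answer is the constraint b1 + b2 = 5.

```python
# Perfectly collinear predictors: x2 is an exact copy of x1.
x1 = [0.5, 1.0, 2.0, 3.5]
x2 = list(x1)
a = 1.0  # hypothetical intercept

def predict(b1, b2):
    """Predicted values a + b1*x1 + b2*x2 for every observation."""
    return [a + b1 * u + b2 * v for u, v in zip(x1, x2)]

# Three coefficient pairs that all satisfy b1 + b2 = 5.
pairs = [(5.0, 0.0), (2.0, 3.0), (-1.0, 6.0)]
preds = [predict(b1, b2) for b1, b2 in pairs]

for (b1, b2), p in zip(pairs, preds):
    print(f"b1={b1:5.1f}  b2={b2:5.1f}  predictions={p}")
```

All three rows print the same predictions, so the data cannot tell these pairs apart: raising b1 by any amount is fully offset by lowering b2 by the same amount.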