Why are lm and biglm producing different estimates?

Asked: May 26, 2026

Why are “lm” and “biglm” producing different estimates? Consider the code below:

a = as.data.frame(cbind(y=rnorm(1000000), x1=rnorm(1000000), x2=rnorm(1000000)))
m1 = lm(y ~ x1 + x2, data=a); summary(m1)

library(biglm)
m2 = biglm(y ~ x1 + x2, data=a); summary(m2)

It makes no difference whether biglm processes the data in chunks or not; the final estimates are different from those produced by lm.

1 Answer

Editorial Team · Answer 1 · 2026-05-26T04:31:28+00:00

Posting as answer simply due to length:

m2$qr

$D
[1] 1.000000e+06 1.001150e+06 9.993772e+05

$rbar
[1] -8.581350e-04 -8.116662e-04 -1.225233e-03  

$thetab
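[1]  7.863159e-04 -4.276900e-04 -1.552812e-03   # right-hand side of R b = thetab, not yet the coefficients

m1$coefficients
  (Intercept)            x1            x2 
 7.846869e-04 -4.295926e-04 -1.552812e-03

So, yes, thetab is slightly different from the lm coefficients: the intercepts differ by about 0.2% and the x1 values by about 0.4%. But thetab is not the final coefficient vector. The numbers above are consistent with biglm storing a QR-style decomposition in which rbar holds the off-diagonal entries of a unit upper-triangular matrix R, and the coefficients b solve R b = thetab. Back-substituting the printed values gives b3 = thetab3 = -1.552812e-03, b2 = thetab2 - rbar3*b3 = -4.295926e-04, and b1 = thetab1 - rbar1*b2 - rbar2*b3 = 7.846869e-04, which matches lm to every printed digit. So compare coef(m1) with coef(m2) directly rather than reading m2$qr. Had the fits really differed by half a percent, whether that mattered would depend rather a lot on what you intend to do with the fit. Integration? No problem. Extrapolation? Always risky, but not because the slopes differ by 0.5%.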
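
To see how the m2$qr pieces relate to the lm coefficients, here is a minimal base-R sketch that back-substitutes the values printed above. It assumes rbar stores the strict upper triangle of R row by row (r12, r13, r23) and that R has a unit diagonal; the variable names are illustrative only.

```r
# Values copied from the m2$qr printout above.
rbar   <- c(-8.581350e-04, -8.116662e-04, -1.225233e-03)
thetab <- c( 7.863159e-04, -4.276900e-04, -1.552812e-03)

# Assumption: rbar is the strict upper triangle of R, stored row by row,
# and the diagonal of R is 1.
R <- diag(3)
R[1, 2] <- rbar[1]
R[1, 3] <- rbar[2]
R[2, 3] <- rbar[3]

# Solve R %*% b = thetab by back-substitution (R is upper triangular).
b <- backsolve(R, thetab)
names(b) <- c("(Intercept)", "x1", "x2")

# Values copied from the m1$coefficients printout above.
lm_coef <- c(7.846869e-04, -4.295926e-04, -1.552812e-03)

print(signif(b, 7))
print(max(abs(b - lm_coef)))  # largest gap between the two sets of estimates
```

Because the inputs are rounded to seven significant digits, the comparison can only be as precise as the printout.
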
I would strongly recommend that, at the very least, you run some test cases that fit, say,
f(x) = g(x) + runif(N); h(x) = g(x) + runif(N)   # each runif call returns a different set of random values

and check whether lm and biglm return significantly different coefficients relative to the original g(x) values.
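
A test along those lines could be sketched as follows. The choice of g(x) = 2 + 3x, the sample size, the seed, and the two-chunk split are all hypothetical; the sketch assumes the biglm package is installed. Note that runif(N) has mean 0.5, so the fitted intercept should land near 2.5 rather than 2.

```r
library(biglm)  # assumes the biglm package is installed

set.seed(1)                 # arbitrary seed, for reproducibility only
N <- 10000
x <- rnorm(N)
g <- 2 + 3 * x              # hypothetical known g(x)
d <- data.frame(x = x,
                f = g + runif(N),   # first noisy copy
                h = g + runif(N))   # second noisy copy, different noise

# Same data, two fitting routes: lm in one pass, biglm in two chunks.
fit_lm  <- lm(f ~ x, data = d)
fit_big <- biglm(f ~ x, data = d[1:5000, ])
fit_big <- update(fit_big, d[5001:N, ])

# Fit on the second noisy copy, to show how much the noise alone moves things.
fit_h <- lm(h ~ x, data = d)

print(rbind(lm_f = coef(fit_lm), biglm_f = coef(fit_big), lm_h = coef(fit_h)))
print(all.equal(coef(fit_lm), coef(fit_big)))
```

Comparing the lm_f and biglm_f rows shows any disagreement between the two fitting routines, while the lm_h row gives a sense of how large the purely noise-driven variation is.
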
