The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7695023
In Process

The Archive Base Latest Questions

Editorial Team
Asked: May 31, 2026

I have two data frames: one is training data (pubs1), the other (pubs2) is test data. I can fit a linear regression model, but I am unable to generate predictions from it. This is not my first time doing this, and I can't figure out what is going wrong.

> head(pubs1 )
  id   pred37   actual     weight       diff1   weightDiff1    pred1    pred2    pred3    pred4
1 11 128.3257 128.3990 6.43482732 -0.07333650 -0.4719076922 126.3149 126.1024 126.9057 126.2718
2 31 100.8822 100.9777 3.55520287 -0.09553741 -0.3396548680 100.7820 100.8589 100.9179 100.8903
3 33 100.7204 100.9630 7.46413438 -0.24262409 -1.8109787866 100.8576 100.8434 100.8521 100.8914
4 52 100.8564 100.9350 0.01299138 -0.07855588 -0.0010205495 100.8700 100.8925 100.8344 100.8714
5 56 100.8410 100.9160 0.01299138 -0.07502125 -0.0009746298 100.8695 100.8889 100.8775 100.8871
6 71 100.8889 100.8591 1.19266269  0.02979818  0.0355391800 100.8357 100.9205 100.8107 100.8316
> head(pubs2 )
      id    pred37     pred1    pred2     pred3     pred4
1 762679  98.32212  97.84181  98.0776  98.03222  97.90022
2 762680 115.79698 114.91411 115.1470 115.27129 115.45027
3 762681 104.56418 104.81372 104.8537 104.66239 104.55240
4 762682 106.65768 106.71011 106.6722 106.68662 106.60757
5 762683 102.15662 103.14207 103.2035 103.31190 103.40397
6 762684 101.96057 102.25939 102.1031 102.20659 102.04557

> lm1 <- lm(pubs1$actual ~ pubs1$pred37 + pubs1$pred1 + pubs1$pred2 
+ + pubs1$pred3 + pubs1$pred4)
> summary(lm1)

Call:
lm(formula = pubs1$actual ~ pubs1$pred37 + pubs1$pred1 + pubs1$pred2 + 
    pubs1$pred3 + pubs1$pred4)

Residuals:
     Min       1Q   Median       3Q      Max 
-18.3415  -0.2309   0.0016   0.2236  17.8639 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -0.122478   0.027227  -4.498 6.85e-06 ***
pubs1$pred37  0.543270   0.005086 106.823  < 2e-16 ***
pubs1$pred1   0.063680   0.007151   8.905  < 2e-16 ***
pubs1$pred2   0.317768   0.010977  28.950  < 2e-16 ***
pubs1$pred3   0.024302   0.008321   2.921  0.00349 ** 
pubs1$pred4   0.052183   0.010879   4.797 1.61e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.7298 on 99994 degrees of freedom
Multiple R-squared: 0.9932,     Adjusted R-squared: 0.9932 
F-statistic: 2.926e+06 on 5 and 99994 DF,  p-value: < 2.2e-16 

>

 > pred2 <- predict(lm1, pubs2)
Warning message:
'newdata' had 50000 rows but variable(s) found have 100000 rows

> str(pubs1)
'data.frame':   100000 obs. of  10 variables:
 $ id         : num  11 31 33 52 56 71 85 87 92 95 ...
 $ pred37     : num  128 101 101 101 101 ...
 $ actual     : num  128 101 101 101 101 ...
 $ weight     : num  6.435 3.555 7.464 0.013 0.013 ...
 $ diff1      : num  -0.0733 -0.0955 -0.2426 -0.0786 -0.075 ...
 $ weightDiff1: num  -0.471908 -0.339655 -1.810979 -0.001021 -0.000975 ...
 $ pred1      : num  126 101 101 101 101 ...
 $ pred2      : num  126 101 101 101 101 ...
 $ pred3      : num  127 101 101 101 101 ...
 $ pred4      : num  126 101 101 101 101 ...
> str(pubs2)
'data.frame':   50000 obs. of  6 variables:
 $ id    : num  762679 762680 762681 762682 762683 ...
 $ pred37: num  98.3 115.8 104.6 106.7 102.2 ...
 $ pred1 : num  97.8 114.9 104.8 106.7 103.1 ...
 $ pred2 : num  98.1 115.1 104.9 106.7 103.2 ...
 $ pred3 : num  98 115 105 107 103 ...
 $ pred4 : num  97.9 115.5 104.6 106.6 103.4 ...
> colnames(pubs1)
 [1] "id"          "pred37"      "actual"      "weight"      "diff1"       "weightDiff1" "pred1"       "pred2"       "pred3"       "pred4"      
> colnames(pubs2)
[1] "id"     "pred37" "pred1"  "pred2"  "pred3"  "pred4" 

Is there anything here that I’m missing?

1 Answer

    Editorial Team added an answer on May 31, 2026 at 9:27 pm

    Instead of,

    lm1 <- lm(pubs1$actual ~ pubs1$pred37 + pubs1$pred1 + pubs1$pred2 +
              pubs1$pred3 + pubs1$pred4)
    

    try,

    lm1 <- lm(actual ~ pred37 + pred1 + pred2 +
              pred3 + pred4, data = pubs1)
    

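    Otherwise predict.lm will look for variables called pubs1$pred37 (and so on) in your new data frame. Since pubs2 has no such variables, R falls back on the pubs1 columns in your workspace, which is why the warning reports 100000 rows instead of 50000: the predictions are computed on the training data, not on pubs2.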
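
The difference can be reproduced on a small scale. The following is only a sketch under stated assumptions: `train` and `test` are hypothetical stand-ins for pubs1 and pubs2, the values are simulated, and only two predictors are used to keep it short.

```r
# Simulated stand-ins for pubs1 (training) and pubs2 (test); all values are made up.
set.seed(1)
train <- data.frame(pred37 = rnorm(100, 100, 5), pred1 = rnorm(100, 100, 5))
train$actual <- 0.6 * train$pred37 + 0.4 * train$pred1 + rnorm(100, sd = 0.5)
test <- data.frame(pred37 = rnorm(50, 100, 5), pred1 = rnorm(50, 100, 5))

# Formula written with `$`: the model terms are literally train$pred37, train$pred1.
bad <- lm(train$actual ~ train$pred37 + train$pred1)
# predict() cannot find those terms in `test`, so it evaluates them against `train`
# (and warns about the row mismatch, suppressed here).
bad_pred <- suppressWarnings(predict(bad, test))
length(bad_pred)   # 100: one value per training row, not per test row

# Formula written with bare column names plus `data =`.
good <- lm(actual ~ pred37 + pred1, data = train)
good_pred <- predict(good, newdata = test)
length(good_pred)  # 50: one prediction per row of `test`
```

In the `$` version the returned values are just the fitted values for the training data, so the warning should not be ignored even though predict still returns a result.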
