Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 3216626
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 17, 20262026-05-17T15:18:59+00:00 2026-05-17T15:18:59+00:00

I want to run a two stage probit least square regression in R. Does

  • 0

I want to run a two stage probit least square regression in R. Does anyone know how to do this? Is there any package out there? I know it’s possible to do it using Stata, so I imagine it’s possible to do it with R.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-17T15:19:00+00:00Added an answer on May 17, 2026 at 3:19 pm

    You might want to be more specific when you say ‘two-stage-probit-least-squares’. Since you refer to a Stata program that implements this I am guessing you are talking about the CDSIMEQ package, which implements the Amemiya (1978) procedure for the Heckit model (a.k.a Generalized Tobit, a.k.a. Tobit type II model, etc.). As Grant said, systemfit will do a Tobit for you, but not with two equations. The MicEcon package did have a Heckit (but the package has split so many times I don’t know where it is now).

    If you want what the CDSIMEQ does, it can easily be implemented in R. I wrote a function that replicates CDSIMEQ:

    tspls <- function(formula1, formula2, data) {
        # The Continous model
        mf1 <- model.frame(formula1, data)
        y1 <- model.response(mf1)
        x1 <- model.matrix(attr(mf1, "terms"), mf1)
    
        # The dicontionous model
        mf2 <- model.frame(formula2, data)
        y2 <- model.response(mf2)
        x2 <- model.matrix(attr(mf2, "terms"), mf2)
    
        # The matrix of all the exogenous variables
        X <- cbind(x1, x2)
        X <- X[, unique(colnames(X))]
    
        J1 <- matrix(0, nrow = ncol(X), ncol = ncol(x1))
        J2 <- matrix(0, nrow = ncol(X), ncol = ncol(x2))
        for (i in 1:ncol(x1)) J1[match(colnames(x1)[i], colnames(X)), i] <- 1
        for (i in 1:ncol(x2)) J2[match(colnames(x2)[i], colnames(X)), i] <- 1
    
        # Step 1:
        cat("\n\tNOW THE FIRST STAGE REGRESSION")
        m1 <- lm(y1 ~ X - 1)
        m2 <- glm(y2 ~ X - 1, family = binomial(link = "probit"))
        print(summary(m1))
        print(summary(m2))
    
        yhat1 <- m1$fitted.values
        yhat2 <- X %*% coef(m2)
    
        PI1 <- m1$coefficients
        PI2 <- m2$coefficients
        V0 <- vcov(m2)
        sigma1sq <- sum(m1$residuals ^ 2) / m1$df.residual
        sigma12 <- 1 / length(y2) * sum(y2 * m1$residuals / dnorm(yhat2))
    
        # Step 2:
        cat("\n\tNOW THE SECOND STAGE REGRESSION WITH INSTRUMENTS")
    
        m1 <- lm(y1 ~ yhat2 + x1 - 1)
        m2 <- glm(y2 ~ yhat1 + x2 - 1, family = binomial(link = "probit"))
        sm1 <- summary(m1)
        sm2 <- summary(m2)
        print(sm1)
        print(sm2)
    
        # Step  3:
        cat("\tNOW THE SECOND STAGE REGRESSION WITH CORRECTED STANDARD ERRORS\n\n")
        gamma1 <- m1$coefficients[1]
        gamma2 <- m2$coefficients[1]
    
        cc <- sigma1sq - 2 * gamma1 * sigma12
        dd <- gamma2 ^ 2 * sigma1sq - 2 * gamma2 * sigma12
        H <- cbind(PI2, J1)
        G <- cbind(PI1, J2)
    
        XX <- crossprod(X)                          # X'X
        HXXH <- solve(t(H) %*% XX %*% H)            # (H'X'XH)^(-1)
        HXXVXXH <- t(H) %*% XX %*% V0 %*% XX %*% H  # H'X'V0X'XH
        Valpha1 <- cc * HXXH + gamma1 ^ 2 * HXXH %*% HXXVXXH %*% HXXH
    
        GV <- t(G) %*% solve(V0)    # G'V0^(-1)
        GVG <- solve(GV %*% G)      # (G'V0^(-1)G)^(-1)
        Valpha2 <- GVG + dd * GVG %*% GV %*% solve(XX) %*% solve(V0) %*% G %*% GVG
    
        ans1 <- coef(sm1)
        ans2 <- coef(sm2)
    
        ans1[,2] <- sqrt(diag(Valpha1))
        ans2[,2] <- sqrt(diag(Valpha2))
        ans1[,3] <- ans1[,1] / ans1[,2]
        ans2[,3] <- ans2[,1] / ans2[,2]
        ans1[,4] <- 2 * pt(abs(ans1[,3]), m1$df.residual, lower.tail = FALSE)
        ans2[,4] <- 2 * pnorm(abs(ans2[,3]), lower.tail = FALSE)
    
        cat("Continuous:\n")
        print(ans1)
        cat("Dichotomous:\n")
        print(ans2)
    }
    

    For comparison, we can replicate the sample from the author of CDSIMEQ in their article about the package.

    > library(foreign)
    > cdsimeq <- read.dta("http://www.stata-journal.com/software/sj3-2/st0038/cdsimeq.dta")
    > tspls(continuous ~ exog3 + exog2 + exog1 + exog4,
    +     dichotomous ~ exog1 + exog2 + exog5 + exog6 + exog7,
    +     data = cdsimeq)
    
            NOW THE FIRST STAGE REGRESSION
    Call:
    lm(formula = y1 ~ X - 1)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -1.885921 -0.438579 -0.006262  0.432156  2.133738 
    
    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
    X(Intercept)  0.010752   0.020620   0.521 0.602187    
    Xexog3        0.158469   0.021862   7.249 8.46e-13 ***
    Xexog2       -0.009669   0.021666  -0.446 0.655488    
    Xexog1        0.159955   0.021260   7.524 1.19e-13 ***
    Xexog4        0.316575   0.022456  14.097  < 2e-16 ***
    Xexog5        0.497207   0.021356  23.282  < 2e-16 ***
    Xexog6       -0.078017   0.021755  -3.586 0.000352 ***
    Xexog7        0.161177   0.022103   7.292 6.23e-13 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
    
    Residual standard error: 0.6488 on 992 degrees of freedom
    Multiple R-squared: 0.5972,     Adjusted R-squared: 0.594 
    F-statistic: 183.9 on 8 and 992 DF,  p-value: < 2.2e-16 
    
    
    Call:
    glm(formula = y2 ~ X - 1, family = binomial(link = "probit"))
    
    Deviance Residuals: 
         Min        1Q    Median        3Q       Max  
    -2.49531  -0.59244   0.01983   0.59708   2.41810  
    
    Coefficients:
                 Estimate Std. Error z value Pr(>|z|)    
    X(Intercept)  0.08352    0.05280   1.582 0.113692    
    Xexog3        0.21345    0.05678   3.759 0.000170 ***
    Xexog2        0.21131    0.05471   3.862 0.000112 ***
    Xexog1        0.45591    0.06023   7.570 3.75e-14 ***
    Xexog4        0.39031    0.06173   6.322 2.57e-10 ***
    Xexog5        0.75955    0.06427  11.818  < 2e-16 ***
    Xexog6        0.85461    0.06831  12.510  < 2e-16 ***
    Xexog7       -0.16691    0.05653  -2.953 0.003152 ** 
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
    
    (Dispersion parameter for binomial family taken to be 1)
    
        Null deviance: 1386.29  on 1000  degrees of freedom
    Residual deviance:  754.14  on  992  degrees of freedom
    AIC: 770.14
    
    Number of Fisher Scoring iterations: 6
    
    
            NOW THE SECOND STAGE REGRESSION WITH INSTRUMENTS
    Call:
    lm(formula = y1 ~ yhat2 + x1 - 1)
    
    Residuals:
         Min       1Q   Median       3Q      Max 
    -2.32152 -0.53160  0.04886  0.53502  2.44818 
    
    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
    yhat2         0.257592   0.021451  12.009   <2e-16 ***
    x1(Intercept) 0.012185   0.024809   0.491    0.623    
    x1exog3       0.042520   0.026735   1.590    0.112    
    x1exog2       0.011854   0.026723   0.444    0.657    
    x1exog1       0.007773   0.028217   0.275    0.783    
    x1exog4       0.318636   0.028311  11.255   <2e-16 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
    
    Residual standard error: 0.7803 on 994 degrees of freedom
    Multiple R-squared: 0.4163,     Adjusted R-squared: 0.4128 
    F-statistic: 118.2 on 6 and 994 DF,  p-value: < 2.2e-16 
    
    
    Call:
    glm(formula = y2 ~ yhat1 + x2 - 1, family = binomial(link = "probit"))
    
    Deviance Residuals: 
         Min        1Q    Median        3Q       Max  
    -2.49610  -0.58595   0.01969   0.59857   2.41281  
    
    Coefficients:
                  Estimate Std. Error z value Pr(>|z|)    
    yhat1          1.26287    0.16061   7.863 3.75e-15 ***
    x2(Intercept)  0.07080    0.05276   1.342 0.179654    
    x2exog1        0.25093    0.06466   3.880 0.000104 ***
    x2exog2        0.22604    0.05389   4.194 2.74e-05 ***
    x2exog5        0.12912    0.09510   1.358 0.174544    
    x2exog6        0.95609    0.07172  13.331  < 2e-16 ***
    x2exog7       -0.37128    0.06759  -5.493 3.94e-08 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
    
    (Dispersion parameter for binomial family taken to be 1)
    
        Null deviance: 1386.29  on 1000  degrees of freedom
    Residual deviance:  754.21  on  993  degrees of freedom
    AIC: 768.21
    
    Number of Fisher Scoring iterations: 6
    
            NOW THE SECOND STAGE REGRESSION WITH CORRECTED STANDARD ERRORS
    
    Continuous:
                    Estimate Std. Error    t value   Pr(>|t|)
    yhat2         0.25759209  0.1043073 2.46955009 0.01369540
    x1(Intercept) 0.01218500  0.1198713 0.10165068 0.91905445
    x1exog3       0.04252006  0.1291588 0.32920764 0.74206810
    x1exog2       0.01185438  0.1290754 0.09184073 0.92684309
    x1exog1       0.00777347  0.1363643 0.05700519 0.95455252
    x1exog4       0.31863627  0.1367881 2.32941597 0.02003661
    Dichotomous:
                     Estimate Std. Error    z value     Pr(>|z|)
    yhat1          1.26286574  0.7395166  1.7076909 0.0876937093
    x2(Intercept)  0.07079775  0.2666447  0.2655134 0.7906139867
    x2exog1        0.25092561  0.3126763  0.8025092 0.4222584495
    x2exog2        0.22603717  0.2739307  0.8251618 0.4092797527
    x2exog5        0.12911922  0.4822986  0.2677163 0.7889176766
    x2exog6        0.95609385  0.2823662  3.3860070 0.0007091758
    x2exog7       -0.37128221  0.3265478 -1.1369920 0.2555416141
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I just want to find out how to run two WPF windows (one in
Why this code don't work,when i want run this code vwd 2008 express show
I want to run two commands in a Windows CMD console. In Linux I
Is there any way I can have thread-safe coroutines along with boost::asio? I want
I want to use two MATLAB sessions on the same machine to run two
I want run a script as follows: runner: ssh 'java program &' ssh 'java
I want to run a weekly batch process in an asp.net page. How can
I want to run a web application on php and mysql, using the CakePHP
I want to run javascript/Python/Ruby inside my application. I have an application that creates
I want to run a command as soon as a certain text appears in

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.