Editorial Team
Asked: May 27, 2026

I am working on a logistic regression model with one continuous predictor and one categorical predictor with several levels. I want to present the results with ggplot2, using facet_wrap to show the regression line for each level of the categorical predictor. When doing this I noticed that the fitted curve drawn by stat_smooth is estimated from only the data in that facet, not from the whole data set. The difference is small, but it is noticeable when comparing the plot against the predicted values returned by predict.glm.

Here is an example that reproduces the issue; the resulting graphic follows the code.

library(boot)    # needed for inv.logit function
library(ggplot2) # version 0.8.9

set.seed(42)
n <- 100

df <- data.frame(location = rep(LETTERS[1:4], n),
                 score    = sample(45:80, 4*n, replace = TRUE))

df$p    <- inv.logit(0.075 * df$score + rep(c(-4.5, -5, -6, -2.8), n))
df$pass <- sapply(df$p, function(x){rbinom(1, 1, x)}) 

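gplot <- ggplot(df, aes(x = score, y = pass)) +
            geom_point() +
            facet_wrap( ~ location) +
            # ggplot2 0.8.9 accepted family directly; newer releases
            # expect it inside method.args
            stat_smooth(method = 'glm', method.args = list(family = 'binomial'))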

# 'full' logistic model
g <- glm(pass ~ location + score, data = df, family = 'binomial')
summary(g)

# new.data for predicting new observations
new.data <- expand.grid(score    = seq(46, 75, length = n), 
                        location = LETTERS[1:4])

new.data$pred.full <- predict(g, newdata = new.data, type = 'response')

pred.sub <- NULL
for(i in LETTERS[1:4]){
  pred.sub <- c(pred.sub,
    predict(update(g, formula = . ~ score, subset = location %in% i), 
            newdata = data.frame(score = seq(46, 75, length = n)), 
            type = 'response'))
}

new.data$pred.sub <- pred.sub

gplot + 
  geom_line(data = new.data, aes(x = score, y = pred.full), color = 'green') + 
  geom_line(data = new.data, aes(x = score, y = pred.sub),  color = 'red')

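[Figure: pass vs. score faceted by location, showing the stat_smooth fits, the full-model predictions (green), and the per-location predictions (red)]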

The discrepancy I am concerned about is easiest to see in facet B. The red curves are the predicted values from models that each consider only one location, whereas the green curves are predictions from the model fitted to the full data set. The models based on subsets of the data match the curves drawn by stat_smooth.

I would like to plot the green curves, with standard error shading, via ggplot2. I’m sure there is an option somewhere that would do this, but I have yet to find it. Perhaps there is a different order of steps I should follow to get the green curves from a ggplot call. I have found similar problems when plotting everything in one panel and using the color or group aesthetic.

Any suggestions would be greatly appreciated.

1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 7:19 pm

    You’re correct that the way to do this is to fit the model outside of ggplot2 and then calculate the fitted values and intervals how you like and pass that data in separately.

    One way to achieve what you describe would be something like this:

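    # fitted probabilities and their standard errors from the full model
    preds <- predict(g, newdata = new.data, type = 'response', se.fit = TRUE)
    new.data$pred.full <- preds$fit

    # approximate band: fit +/- 2 standard errors on the response scale
    new.data$ymin <- new.data$pred.full - 2 * preds$se.fit
    new.data$ymax <- new.data$pred.full + 2 * preds$se.fit

    # y = pred.full overrides the inherited y = pass, which new.data lacks
    ggplot(df, aes(x = score, y = pass)) +
        facet_wrap(~ location) +
        geom_point() +
        geom_ribbon(data = new.data, aes(y = pred.full, ymin = ymin, ymax = ymax), alpha = 0.25) +
        geom_line(data = new.data, aes(y = pred.full), colour = "blue")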
    

    [Figure: the faceted plot with the full-model fitted curves in blue and shaded bands of 2 standard errors]

    This comes with the usual warnings about intervals on fitted values: it’s up to you to make sure that the interval you’re plotting is what you really want. There tends to be a lot of confusion about “prediction intervals”.
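
    One interval choice worth considering: a band computed as fit ± 2 standard errors on the response scale can fall below 0 or above 1. The sketch below is one possible variant, not the method used above. It predicts on the link (log-odds) scale and back-transforms the fit and both bounds with plogis(), so the band stays inside [0, 1]. It rebuilds the question's data with base R (plogis() standing in for boot::inv.logit(), and a vectorised rbinom() call, both my substitutions), and the multiplier of 2 is kept only to mirror the code above.

```r
# Rebuild the question's example data using base R only.
# Assumption: plogis() replaces boot::inv.logit(); both are the inverse logit.
set.seed(42)
n <- 100
df <- data.frame(location = rep(LETTERS[1:4], n),
                 score    = sample(45:80, 4 * n, replace = TRUE))
df$p    <- plogis(0.075 * df$score + rep(c(-4.5, -5, -6, -2.8), n))
df$pass <- rbinom(nrow(df), 1, df$p)

# Full model fitted to all locations at once, as in the question.
g <- glm(pass ~ location + score, data = df, family = binomial)

new.data <- expand.grid(score    = seq(46, 75, length.out = n),
                        location = LETTERS[1:4])

# Predict on the link scale, then back-transform the fit and both bounds.
link <- predict(g, newdata = new.data, type = "link", se.fit = TRUE)
new.data$pred.full <- plogis(link$fit)
new.data$ymin      <- plogis(link$fit - 2 * link$se.fit)
new.data$ymax      <- plogis(link$fit + 2 * link$se.fit)

# Plot only if ggplot2 is installed; the data frame above is usable either way.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = score, y = pass)) +
    facet_wrap(~ location) +
    geom_point() +
    # inherit.aes = FALSE avoids looking up 'pass', which new.data lacks
    geom_ribbon(data = new.data,
                aes(x = score, ymin = ymin, ymax = ymax),
                inherit.aes = FALSE, alpha = 0.25) +
    geom_line(data = new.data, aes(y = pred.full), colour = "blue")
  print(p)
}
```

    Because plogis() is monotone, the back-transformed band always contains the fitted curve, but it is no longer symmetric around it. Whether this or the response-scale band is the right one depends on what you want the shading to convey.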
