I’ve never noticed this behavior before, but I’m surprised at the output naming conventions

Question

Asked: May 24, 2026 (2026-05-24T22:02:57+00:00)

I’ve never noticed this behavior before, but I’m surprised at the output naming conventions for linear model summaries. My question, essentially, is why row names in a linear model summary always seem to carry the name of the column they came from.

An example

Suppose you had some data for 300 movie audience members from three different cities:

Chicago
Milwaukee
Dayton

And suppose all of them were subjected to the stinking pile of confusing, contaminated waste that was Spider-Man 3. After enduring the entirety of that cinematic abomination, they were asked to rate the movie on a 100-point scale.

Because all of the audience members were reasonable human beings, the ratings were all below zero. (Naturally. Anyone who’s seen the movie would agree.)

Here’s what that might look like in R:

> score <- rnorm(n = 300, mean = -50, sd = 10)
> city  <- rep(c("Chicago", "Milwaukee", "Dayton"), times = 100)
> spider.man.3.sucked <- data.frame(score, city)
> head(spider.man.3.sucked)
      score      city
1 -64.57515   Chicago
2 -50.51050 Milwaukee
3 -56.51409    Dayton
4 -45.55133   Chicago
5 -47.88686 Milwaukee
6 -51.22812    Dayton

Great. So let’s run a quick linear model, assign it to lm1, and get its summary output:

> lm1 <- lm(score ~ city, data = spider.man.3.sucked)
> summary(lm1)

Call:
lm(formula = score ~ city, data = spider.man.3.sucked)

Residuals:
     Min       1Q   Median       3Q      Max 
-29.8515  -6.1090  -0.4745   6.0340  26.2616 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -51.3621     0.9630 -53.337   <2e-16 ***
cityDayton      1.1892     1.3619   0.873    0.383    
cityMilwaukee   0.8288     1.3619   0.609    0.543    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 9.63 on 297 degrees of freedom
Multiple R-squared: 0.002693,   Adjusted R-squared: -0.004023 
F-statistic: 0.4009 on 2 and 297 DF,  p-value: 0.6701

What’s bugging me

The part I want to highlight is this:

cityDayton      1.1892     1.3619   0.873    0.383    
cityMilwaukee   0.8288     1.3619   0.609    0.543

It looks like R sensibly concatenated the column name (city, if you remember from above) with the distinct value (in this case either Dayton or Milwaukee). If I don’t want R to output in that format, is there any way to override it? For example, in my case all I’d need is simply:

Dayton      1.1892     1.3619   0.873    0.383    
Milwaukee   0.8288     1.3619   0.609    0.543

Two questions in one

So,

What’s controlling the format of the output for linear model summary rows, and
Can/should I change it?

1 Answer

Editorial Team · Answer 1 · 2026-05-24T22:02:58+00:00

The extractor function for the coefficient table of a summary object is coef(). Does the following give you acceptable control over the output?

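# Extract the coefficient matrix from the model summary
summ <- summary(lm1)
csumm <- coef(summ)
# Strip the leading "city" prefix from the row names
rownames(csumm) <- sub("^city", "", rownames(csumm))
# Print every row except the intercept
print(csumm[-1, ], digits = 4)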
#           Estimate Std. Error t value Pr(>|t|)
# Dayton      0.8133      1.485  0.5478   0.5842
# Milwaukee   0.3891      1.485  0.2621   0.7934

(No random seed was set in the question, so I cannot reproduce your exact values.)
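
On the first question (what controls the names): as far as I know, the row names are simply the column names of the design matrix that lm() builds through model.matrix(). With the default treatment contrasts, each dummy column is named by pasting the variable name onto the level name. The following is a minimal sketch to check that, not something from the original post; the seed value, the data frame name movie, and the birthplace column are made up for illustration.

```r
# Reproducible version of the question's data (the seed value is arbitrary).
set.seed(42)
score <- rnorm(n = 300, mean = -50, sd = 10)
city  <- rep(c("Chicago", "Milwaukee", "Dayton"), times = 100)
movie <- data.frame(score, city)

# The design matrix built from the formula; its column names are what
# later show up as coefficient (row) names in the summary.
mm <- model.matrix(score ~ city, data = movie)
design_names <- colnames(mm)
print(design_names)
# [1] "(Intercept)"   "cityDayton"    "cityMilwaukee"

fit <- lm(score ~ city, data = movie)
coef_names <- names(coef(fit))
print(coef_names)  # same names as the design matrix columns

# Hypothetical second factor that shares a level name with city.
# The variable-name prefix is what keeps the two "Dayton" terms apart.
movie$birthplace <- rep(c("Dayton", "Chicago"), times = 150)
fit2 <- lm(score ~ city + birthplace, data = movie)
fit2_names <- names(coef(fit2))
print(fit2_names)
# [1] "(Intercept)"      "cityDayton"       "cityMilwaukee"    "birthplaceDayton"
```

The second model suggests why the prefix is there: without it, two different rows would both be labelled Dayton. That is an argument for renaming only the printed table, as above, rather than trying to change how the model itself names its terms.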
