This is a followup question to this question , initially inspired by this question


Editorial Team

Asked: June 1, 2026 (2026-06-01T23:37:36+00:00)


This is a followup question to this question, initially inspired by this question, but not quite the same.

This is my situation. First I pull some data from a database,

df <- data.frame(id = c(1:6),
                 profession = c(1, 5, 4, NA, 0, 5))
df
#  id profession
#  1          1
#  2          5
#  3          4
#  4         NA
#  5          0
#  6          5

Second, I pull a key-table with human readable information about the profession codes,

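# stringsAsFactors = TRUE keeps the labels as a factor (the default before R 4.0.0),
# which the levels() calls further down rely on
profession.codes <- data.frame(profession.code = c(1, 2, 3, 4, 5),
                               profession.label = c('Optometrists',
                               'Accountants', 'Veterinarians',
                               'Financial analysts', 'Nurses'),
                               stringsAsFactors = TRUE)
profession.codes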
#  profession.code   profession.label
#               1       Optometrists
#               2        Accountants
#               3      Veterinarians
#               4 Financial analysts
#               5             Nurses

Now I would like to overwrite the profession variable in my df with the labels from profession.codes, preferably using join from the plyr package, but I'm open to any smart solution. I do like that plyr's join preserves the row order of x.

I currently do it like this,

# install.packages('plyr', dependencies = TRUE)
library(plyr)

profession.codes$profession <- profession.codes$profession.code
df <- join(df, profession.codes, by="profession")
# levels(df$profession.label)
df$profession.label <- factor(df$profession.label, 
   levels = c(levels(df$profession.label), 
   setdiff(df$profession, df$profession.code)))
# levels(df$profession.label)
df$profession.label[df$profession==0 ] <- 0
df$profession.code <- NULL
df$profession  <- NULL
names(df) <- c("id", "profession")
df
#  id         profession
#  1       Optometrists
#  2             Nurses
#  3 Financial analysts
#  4               <NA>
#  5                  0
#  6             Nurses

This is how I overwrite profession without losing the NA and the 0.

The problem is that the 0 could be a 17 or any number and I would like to account for that in some way. Furthermore, I would also like to shorten my code, if possible.

Any help would be greatly appreciated.

Thanks,
Eric

1 Answer

Editorial Team · Answer 1 · 2026-06-01T23:37:37+00:00

Here is one approach in base R:

df <- data.frame(id = c(1:6),
                 profession = c(1, 5, 4, NA, 0, 5))

pc <- data.frame(profession.code = c(1, 2, 3, 4, 5),
                 profession.label = c('Optometrists', 'Accountants',
                                      'Veterinarians', 'Financial analysts',
                                      'Nurses'))

# Look up the label for each code; codes missing from pc (and NA) give NA
df$new <- as.character(pc[match(df$profession,
    pc$profession.code), 'profession.label'])
# Fall back to the original code wherever no label was found
df[is.na(df$new), 'new'] <- df[is.na(df$new), 'profession']
df$new <- as.factor(df$new)
df

Which yields:

  id profession                new
1  1          1       Optometrists
2  2          5             Nurses
3  3          4 Financial analysts
4  4         NA               <NA>
5  5          0                  0
6  6          5             Nurses
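
The question also asks to overwrite profession itself and to cope with any unlabeled code, not just 0. One way to do that is to wrap the same match() lookup in a small helper. This is only a minimal sketch in base R: recode_with_fallback is a hypothetical name (it is not part of plyr or base R), and 17 stands in for the "any number" case mentioned in the question.

```r
df <- data.frame(id = 1:6,
                 profession = c(1, 5, 4, NA, 17, 5))

pc <- data.frame(profession.code = 1:5,
                 profession.label = c('Optometrists', 'Accountants',
                                      'Veterinarians', 'Financial analysts',
                                      'Nurses'),
                 stringsAsFactors = FALSE)

# Hypothetical helper: replace each code by its label, keep unknown
# codes as text, and leave NA as NA.
recode_with_fallback <- function(x, codes, labels) {
  out <- as.character(labels)[match(x, codes)]
  # Positions with a real code that has no label in the key table
  unmatched <- is.na(out) & !is.na(x)
  out[unmatched] <- as.character(x[unmatched])
  out
}

# Overwrite the column in place; row order is untouched
df$profession <- recode_with_fallback(df$profession,
                                      pc$profession.code,
                                      pc$profession.label)
df
```

Because match() works position by position, the rows stay in their original order, which is the property of plyr's join the question wanted to keep. The result is a character column; wrap it in factor() if a factor is needed.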
