What is the best way to determine a factor or create a new category

Asked: May 25, 2026

What is the best way to derive a factor, or create a new category field, from several boolean fields? In this example, I need to count how many times each unique combination of medications occurs.

   > MultPsychMeds
       ID OLANZAPINE HALOPERIDOL QUETIAPINE RISPERIDONE
    1   A          1           1          0           0
    2   B          1           0          1           0
    3   C          1           0          1           0
    4   D          1           0          1           0
    5   E          1           0          0           1
    6   F          1           0          0           1
    7   G          1           0          0           1
    8   H          1           0          0           1
    9   I          0           1          1           0
    10  J          0           1          1           0

Another way to state it: I need to pivot or cross-tabulate the pairs. The final result should look something like:

Combination            Count
OLANZAPINE/HALOPERIDOL     1
OLANZAPINE/QUETIAPINE      3
OLANZAPINE/RISPERIDONE     4
HALOPERIDOL/QUETIAPINE     2

This data frame can be replicated in R with:

MultPsychMeds <- structure(list(ID = structure(1:10, .Label = c("A", "B", "C", 
"D", "E", "F", "G", "H", "I", "J"), class = "factor"), OLANZAPINE = c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L), HALOPERIDOL = c(1L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L), QUETIAPINE = c(0L, 1L, 1L, 1L, 
0L, 0L, 0L, 0L, 1L, 1L), RISPERIDONE = c(0L, 0L, 0L, 0L, 1L, 
1L, 1L, 1L, 0L, 0L)), .Names = c("ID", "OLANZAPINE", "HALOPERIDOL", 
"QUETIAPINE", "RISPERIDONE"), class = "data.frame", row.names = c(NA, 
-10L))

1 Answer

Editorial Team · Answer 1 · 2026-05-25T01:03:35+00:00

Here’s one approach using the reshape and plyr packages:

library(reshape)
library(plyr)

# Melt into long format: one row per ID/drug pair
dat.m <- melt(MultPsychMeds, id.vars = "ID")
# Group by ID and paste the names of the drugs taken (value == 1) together with "/"
out <- ddply(dat.m, "ID", summarize, combos = paste(variable[value == 1], collapse = "/"))

# Tabulate how often each combination occurs
with(out, count(combos))

                       x freq
1 HALOPERIDOL/QUETIAPINE    2
2 OLANZAPINE/HALOPERIDOL    1
3  OLANZAPINE/QUETIAPINE    3
4 OLANZAPINE/RISPERIDONE    4
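
If you would rather not load extra packages, the same idea (build one combination label per row, then tabulate the labels) can be sketched in base R. This is only a rough alternative, not part of the approach above. It assumes every column other than ID is a 0/1 drug indicator, and the names drug_cols, taken, combos and result are made up for illustration. The data frame is rebuilt here so the snippet runs on its own.

```r
# Rebuild the example data from the question so the snippet is self-contained
MultPsychMeds <- data.frame(
  ID          = LETTERS[1:10],
  OLANZAPINE  = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L),
  HALOPERIDOL = c(1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L),
  QUETIAPINE  = c(0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L),
  RISPERIDONE = c(0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 0L)
)

# Assumption: everything except ID is a 0/1 indicator column
drug_cols <- setdiff(names(MultPsychMeds), "ID")

# Logical matrix: TRUE where the patient takes the drug
taken <- as.matrix(MultPsychMeds[drug_cols]) == 1

# For each row, paste the names of the drugs taken, in column order
combos <- apply(taken, 1, function(row) paste(drug_cols[row], collapse = "/"))

# Tabulate the labels and return them as a two-column data frame
result <- as.data.frame(table(Combination = combos),
                        responseName = "Count",
                        stringsAsFactors = FALSE)
result
```

Because the label is built from whichever columns equal 1, a patient on three or more drugs would get a longer label of its own (for example "A/B/C") rather than being split into pairs.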
