Say you have an array like
dat <- array(c(126, 100, 35, 61, 908, 688, 497, 807, 913, 747, 336, 598,
               235, 172, 58, 121, 402, 308, 121, 215, 182, 156, 72, 98,
               60, 99, 11, 43, 104, 89, 21, 36),
             dim = c(2, 2, 8),
             dimnames = list(a = c(1, 0), b = c(1, 0), c = 1:8))
> dat
, , c = 1
b
a 1 0
1 126 35
0 100 61
, , c = 2
b
a 1 0
1 908 497
0 688 807
, , c = 3
b
a 1 0
1 913 336
0 747 598
, , c = 4
b
a 1 0
1 235 58
0 172 121
, , c = 5
b
a 1 0
1 402 121
0 308 215
, , c = 6
b
a 1 0
1 182 72
0 156 98
, , c = 7
b
a 1 0
1 60 11
0 99 43
, , c = 8
b
a 1 0
1 104 21
0 89 36
and you want to fit a logistic regression to predict a. Is there a simple way to generate a data frame from this array to use in glm? That is, a data frame like
a b c
1 1 1 for 126 rows then
...
0 1 1 for 100 rows, etc.
Basically, I need to get the data into a form that lets me fit a logistic regression when all I am given is a table of counts. It seems like there should be a simple way of doing this without generating the data manually.
Thanks
One way would be to start with the `melt` function in the `reshape2` package, which turns the array into a long data frame (`datM`) with one row per cell of the table and the count in a `value` column. Then `dcast` that data to get the numbers of both outcomes on one row. You can then use this data to fit a `glm` where the response is a 2-column matrix giving the numbers of successes and failures.

However, the model degrees of freedom (and log-likelihood, etc.) will not reflect the data structure you asked for in your question. To get the specific data structure you were aiming for, you could go back to the `datM` object.

EDIT: Loop over all columns of `datM` except for the `value` column, repeating each value `datM$value` times. Then `cbind` the result back into a `matrix` and convert it to a `data.frame` to get the data structure you wanted.

The coefficients of the two models are the same, but, as mentioned, the degrees of freedom, etc. will not be.
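
A minimal sketch of the steps above, assuming `reshape2` is installed. Only `dat` and `datM` are names from the question and answer; `datW`, `datL`, `fit_grouped`, `fit_expanded` and the model formula `b + c` are illustrative assumptions. The expansion uses `lapply` with `as.data.frame` rather than `cbind`, which should give the same rows.

```r
library(reshape2)

dat <- array(c(126, 100, 35, 61, 908, 688, 497, 807, 913, 747, 336, 598,
               235, 172, 58, 121, 402, 308, 121, 215, 182, 156, 72, 98,
               60, 99, 11, 43, 104, 89, 21, 36),
             dim = c(2, 2, 8),
             dimnames = list(a = c(1, 0), b = c(1, 0), c = 1:8))

# Long format: one row per cell of the table, with the count in `value`.
datM <- melt(dat)

# Wide format: one row per (b, c) combination, with the counts for
# a = 0 and a = 1 in columns named "0" and "1".
datW <- dcast(datM, b + c ~ a, value.var = "value")

# Grouped binomial fit: the response is a two-column matrix of
# (successes, failures), here (count with a = 1, count with a = 0).
# Assumption: c enters as a numeric term; use factor(c) to treat it as strata.
fit_grouped <- glm(cbind(`1`, `0`) ~ b + c, family = binomial, data = datW)

# Expanded format: repeat every row of datM `value` times and drop the count,
# giving one row per observation as asked for in the question.
vars <- setdiff(names(datM), "value")
datL <- as.data.frame(lapply(datM[vars], rep, times = datM$value))

# Bernoulli fit on the expanded data.
fit_expanded <- glm(a ~ b + c, family = binomial, data = datL)

# Same coefficients (up to numerical tolerance) ...
all.equal(coef(fit_grouped), coef(fit_expanded), tolerance = 1e-6)

# ... but different residual degrees of freedom.
c(grouped = df.residual(fit_grouped), expanded = df.residual(fit_expanded))
```

The grouped fit sees 16 rows (one per combination of `b` and `c`), while the expanded fit sees one row per observation, which is why the residual degrees of freedom and the log-likelihood differ even though the estimates agree.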