My dataframe looks like this: ID | value A | value B 1 |

Question

Asked: May 28, 2026

My dataframe looks like this:

ID | value A | value B
1  |   A1    |   F
1  |   A2    |   N
1  |   A3    |   B
1  |   A4    |   S
2  |   A1    |   B
2  |   A2    |   G
2  |   A3    |   N
3  |   A1    |   F
3  |   A2    |   H
3  |   A3    |   J
3  |   A4    |   N

Each ID should have 4 rows. I am trying to use the dcast() function, but it only works if all IDs have the same number of rows; ID 2 would be an error case in this example. Is there an easy way to find all IDs that have more or fewer than 4 rows?
Or is there a way to make dcast() ignore the error cases?

Ultimately I am trying to reshape the data frame to get something like this:

ID | A1 | A2 | A3 | A4
 1 | F  | N  | B  | S 
 2 | B  | G  | N  | NA
 3 | F  | H  | J  | N

Apparently the dcast() function from the reshape2 package doesn't work with irregular IDs. It gives me the following message: 'Aggregation function missing: defaulting to length'. With a smaller part of my dataset, which doesn't have those irregular IDs, it works. Any ideas?
Or does anyone have an idea how to reshape my data frame without using dcast()? Thanks!

I am working on a mac with the following (package-) versions:

sessionInfo() 
R version 2.14.1 (2011-12-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] reshape2_1.2.1 plyr_1.7.1    

loaded via a namespace (and not attached):
[1] stringr_0.6

The values in the first column are all integers; the other columns hold character values.

sapply(x, class)
         ID      fach01      f01_lp 
  "integer" "character" "character"

As for the reproducible example:
I hope this helps (it is taken from my original data frame). If I only use the first 500 rows of the data frame, dcast() works perfectly fine; the problem occurs when I use the whole data frame of about 140000 rows.

df <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 
7L, 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L),  A = c("2.LF", 
"1.LF", "3.PF", "4.PF", "3.PF", "1.LF", "2.LF", "3.PF", 
"4.PF", "1.LF", "2.LF", "3.PF", "1.LF", "4.PF", "2.LF", "1.LF", 
"2.LF", "4.PF", "3.PF", "1.LF", "3.PF", "2.LF", "4.PF", "3.PF", 
"4.PF", "1.LF", "2.LF", "4.PF", "2.LF", "3.PF", "1.LF", "1.LF", 
"2.LF", "3.PF", "4.PF"), B = c("Mu/Ku", 
"Fs", "2.AF", "NW", "DE", "2.AF", "MA", "Fs", "2.AF", "NW", 
"NW", "Fs", "2.AF", "bel", "NW", "Fs", "bel", "bel", "NW", "DE", 
"2.AF", "2.AF", "MA", "Fs", "2.AF", "MA", "NW", "DE", "2.AF", 
"MA", "NW", "Mu/Ku", "Fs", "2.AF", "NW")), .Names = c("ID", "A", "B"
), row.names = c(NA, -35L), class = "data.frame")
# Note: row.names reset to the default 1..35; the pasted list had 36 names for 35 rows.

In my original data frame the values A1 to A4 (here called 1.PF to 4.PF) are not in the right order; sorting them into columns is what I want dcast() to do (same layout as above):

ID | 1.PF | 2.PF | 3.PF | 4.PF
 1 | F    | NW   | DE   | S 
 2 | bel  | G    | N    | <NA>
 3 | F    | NW   | bel  | N

EDIT:

I didn't solve the dcast() problem, but I found a way to work around it using the reshape() function from base R (the stats package):

# timevar names the column whose values become the new column names ('A' in the example df above)
df <- reshape(df, idvar = 'ID', timevar = 'A', direction = 'wide')
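For readers who want to try the workaround in isolation, here is a minimal sketch in base R. It assumes the column names ID, A and B from the example df and uses the first two IDs of the table at the top of the question, with the rows shuffled. The sorting step and the renaming of the result columns are additions for illustration, not part of the original workaround.

```r
# Stand-in data: first two IDs from the question's table, rows shuffled.
long <- data.frame(
  ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  A  = c("A2", "A1", "A3", "A4", "A3", "A1", "A2"),
  B  = c("N", "F", "B", "S", "N", "B", "G"),
  stringsAsFactors = FALSE
)

# reshape() creates the wide columns in order of first appearance,
# so sort by ID and A first to get A1..A4 in order.
long <- long[order(long$ID, long$A), ]

wide <- reshape(long, idvar = "ID", timevar = "A", direction = "wide")

# reshape() prefixes the value column name ("B.A1", ...); strip the prefix.
names(wide) <- sub("^B\\.", "", names(wide))
print(wide)
```

IDs with fewer than four rows should simply get NA in the missing columns, which is the behaviour asked for in the question.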

1 Answer

Editorial Team · Answer 1 · 2026-05-28T03:59:50+00:00

table() and which() would certainly answer the first question (using != 4 to catch IDs with either more or fewer than 4 rows):

names(table(dfrm$ID))[which(table(dfrm$ID) != 4)]
#[1] "2"
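The question also asks whether the irregular IDs can be skipped. One possible approach, sketched here in base R, is to drop them before reshaping. The names counts, irregular and regular are made up for this illustration, and the data frame holds only the ID column of the example.

```r
# ID column of the example: ID 2 has only three rows.
dfrm <- data.frame(ID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3))

# Rows per ID.
counts <- table(dfrm$ID)

# IDs whose row count differs from the expected 4 (returned as character).
irregular <- names(counts)[counts != 4]
irregular

# Keep only the rows belonging to regular IDs.
regular <- dfrm[!(dfrm$ID %in% irregular), , drop = FALSE]
nrow(regular)
```

Whether dropping these IDs is appropriate depends on the analysis; the reshaped result further down shows that dcast() can also keep them and fill the gap with NA.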

As for the second question, maybe you should post the code that is generating the error. At the moment it’s not clear what you are trying (and failing) to do.

EDIT:

If I convert the factor variables to character variables I can get dcast() to return the correct object, although my error is different from yours. I got the error in both reshape2 1.1 and reshape2 1.2.1 on R 2.14.1 on a Mac.

EDIT2: As it turned out, the bug was fixed in the newest version of plyr. I get no error with reshape2 1.2.1 running with plyr 1.7. You should update those two packages as well and restart with a fresh session.

require(reshape2)
dfrm <- structure(list(ID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3), value.A = structure(c(1L, 
2L, 3L, 4L, 1L, 2L, 3L, 1L, 2L, 3L, 4L), .Label = c("   A1    ", 
"   A2    ", "   A3    ", "   A4    "), class = "factor"), value.B = structure(c(2L, 
6L, 1L, 7L, 1L, 3L, 6L, 2L, 4L, 5L, 6L), .Label = c("   B", "   F", 
"   G", "   H", "   J", "   N", "   S"), class = "factor")), .Names = c("ID", 
"value.A", "value.B"), class = "data.frame", row.names = c(NA, 
-11L))
dcast(dfrm, ID ~ value.A)
# Using value.B as value column: use value_var to override.
# Error in names(data) <- array_names(res$labels[[2]]) : 
#  'names' attribute [4] must be the same length as the vector [1]
# I first tried removing the leading and trailing spaces with:
dfrm2 <- data.frame(lapply(dfrm, gsub, pattern = "^\\s+|\\s+$", replacement = ""))
# Still got the error. Now try to leave as "character" type.

dfrm2 <- data.frame(lapply(dfrm, gsub, pattern = "^\\s+|\\s+$", replacement = ""),
                    stringsAsFactors = FALSE)
str(dfrm2)
#-----------------
# 'data.frame':   11 obs. of  3 variables:
#  $ ID     : chr  "1" "1" "1" "1" ...
#  $ value.A: chr  "A1" "A2" "A3" "A4" ...
#  $ value.B: chr  "F" "N" "B" "S" ...

dcast(dfrm2, ID ~ value.A)
#------------------
# Using value.B as value column: use value_var to override.
#   ID A1 A2 A3   A4
# 1  1  F  N  B    S
# 2  2  B  G  N <NA>
# 3  3  F  H  J    N
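As for the message quoted in the question, 'Aggregation function missing: defaulting to length': dcast() prints it when at least one ID/column combination occurs in more than one row, so it has to aggregate. Whether that is what happens in the full 140000-row data is an assumption, but it is cheap to check. The sketch below uses base R only and made-up data with one deliberate duplicate; the names long and dup_rows are illustrative.

```r
# Made-up data: ID 1 has two rows for "A2", which would force aggregation.
long <- data.frame(
  ID = c(1L, 1L, 1L, 2L, 2L),
  A  = c("A1", "A2", "A2", "A1", "A2"),
  B  = c("F", "N", "B", "G", "H"),
  stringsAsFactors = FALSE
)

# Flag every row whose ID/A pair occurs more than once
# (scan from both ends so the first occurrence is flagged too).
key <- long[c("ID", "A")]
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)

dup_rows <- long[is_dup, ]
print(dup_rows)
```

If this returns no rows on the real data, duplicates are not the cause and the plyr update described above is the more likely fix.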
