I have a data frame of the form:
  Family Code Length  Type
1      A    1     11 Alpha
2      A    3      8  Beta
3      A    3      9  Beta
4      B    4      7 Alpha
5      B    5      8 Alpha
6      C    6      2  Beta
7      C    6      5  Beta
8      C    6      4  Beta
I would like to reduce the data set to one containing unique values of Code by taking a mean of Length values, but to retain all string variables too, i.e.
  Family Code Length  Type
1      A    1     11 Alpha
2      A    3    8.5  Beta
3      B    4      7 Alpha
5      B    5      8 Alpha
6      C    6   3.67  Beta
I’ve tried aggregate() and ddply() but these seem to replace strings with NA and I’m struggling to find a way round this.
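Since Family and Type are constant within a Code group, you can “group” on those as well without changing anything when you use ddply. If your original data set was dat, calling ddply with Family, Code and Type as the grouping variables and summarizing Length by its mean gives one row per Code, with Family and Type retained.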
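A minimal sketch of that call, assuming the plyr package is installed and the question's data is rebuilt by hand as dat (the construction of dat and the name res are illustrative, not taken from the original post):

```r
library(plyr)

# Rebuild the example data from the question (assumed construction).
dat <- data.frame(
  Family = c("A", "A", "A", "B", "B", "C", "C", "C"),
  Code   = c(1, 3, 3, 4, 5, 6, 6, 6),
  Length = c(11, 8, 9, 7, 8, 2, 5, 4),
  Type   = c("Alpha", "Beta", "Beta", "Alpha", "Alpha", "Beta", "Beta", "Beta"),
  stringsAsFactors = FALSE
)

# Group on Family and Type as well as Code; because they are constant
# within each Code, this yields the same groups as grouping on Code alone.
res <- ddply(dat, .(Family, Code, Type), summarize, Length = mean(Length))
res
```

The result should have one row per Code, with the Length column holding the group means and the string columns carried through as grouping variables.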
If Family and Type are not constant within a Code group, then you would need to define how to summarize/aggregate those values. In this example, I just take the single unique value of each within the group.
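One way this could look with ddply, grouping on Code only and summarizing each string column with unique(). This is a sketch under the same assumptions as before: plyr is installed, and dat is a hand-built copy of the question's data (the name res_code is illustrative):

```r
library(plyr)

# Rebuild the example data from the question (assumed construction).
dat <- data.frame(
  Family = c("A", "A", "A", "B", "B", "C", "C", "C"),
  Code   = c(1, 3, 3, 4, 5, 6, 6, 6),
  Length = c(11, 8, 9, 7, 8, 2, 5, 4),
  Type   = c("Alpha", "Beta", "Beta", "Alpha", "Alpha", "Beta", "Beta", "Beta"),
  stringsAsFactors = FALSE
)

# Group on Code alone and say explicitly how each column is reduced:
# the mean for Length, the single distinct value for Family and Type.
res_code <- ddply(dat, .(Code), summarize,
                  Family = unique(Family),
                  Length = mean(Length),
                  Type   = unique(Type))
res_code
```

Note that unique() only reduces to one value per group when the column really is constant within the group; if a Code had several distinct Type values, a different summary (for example the first value, or the values pasted together) would have to be chosen.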
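Update: Similar options using dplyr are to group by Family, Code and Type and summarize Length by its mean, or to group by Code alone and take the unique Family and Type within each group.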
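Both dplyr variants might be written as follows. This is a sketch assuming the dplyr package is installed and dat is a hand-built copy of the question's data; the names by_all and by_code are illustrative:

```r
library(dplyr)

# Rebuild the example data from the question (assumed construction).
dat <- data.frame(
  Family = c("A", "A", "A", "B", "B", "C", "C", "C"),
  Code   = c(1, 3, 3, 4, 5, 6, 6, 6),
  Length = c(11, 8, 9, 7, 8, 2, 5, 4),
  Type   = c("Alpha", "Beta", "Beta", "Alpha", "Alpha", "Beta", "Beta", "Beta"),
  stringsAsFactors = FALSE
)

# Option 1: group on all three columns, since Family and Type are
# constant within each Code.
by_all <- dat %>%
  group_by(Family, Code, Type) %>%
  summarize(Length = mean(Length))

# Option 2: group on Code alone and reduce each string column to its
# single distinct value.
by_code <- dat %>%
  group_by(Code) %>%
  summarize(Family = unique(Family),
            Length = mean(Length),
            Type   = unique(Type))

by_all
by_code
```

With recent dplyr versions the first call prints an informational message about the remaining grouping of the result; it does not affect the computed means.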