First, this is a very basic question that I’m unsure of how to phrase. If the question is a duplicate (though I checked using what I thought might be appropriate phrasing), I’ll obviously retract and appreciate the link.
Second, I am sure there is an easier way to do what I’m trying, but don’t want to get off-track.
OK. I’m attempting to just get a table of column proportions from a matrix of 0/1’s (the proportion of 1’s conditional on a value of another variable, which is PARTY in this case).
My data.frame is m103, with dimensions (437, 91), and the following process works (as in, it produces what I want):
prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
But of course, I want to actually keep the output, and this is where the error arises. If I do this:
a <- prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
Things are great. But IMMEDIATELY after this, if I try:
m103.avg.prop <- prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
I get the error:
Error in FUN(X[[2L]], ...) : only defined on a data frame with all numeric variables
I’d like to keep a rational naming scheme going in my code (which the second example would continue), but I can’t tell if this has something to do with what I’ve tried to assign the output to, or something else.
Many thanks!
EDIT: Let’s see if I can be more explicit
#Data import
m103 <- read.csv("103_members_party.csv", header=T)
#See the first few rows/columns
m103[1:5,1:5]
#Produces this:
     ID PARTY X930 X461 X137
1 15245   100    0    0    0
2 15000   100    0    0    0
3 29108   200    0    0    0
4 15001   100    0    0    0
5 29132   100    0    0    0
#Sum and get col percentages by PARTY (sums the 1's when PARTY==100, PARTY==200, etc)
#WITHOUT assigning to anything
prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
#Produces:
            PARTY          V1
[1,] 1.122515e-05 0.580000465
[2,] 2.245030e-05 0.416619418
[3,] 3.681849e-05 0.003309623
#With assignment to a
a <- prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
a
#Produces
            PARTY          V1
[1,] 1.122515e-05 0.580000465
[2,] 2.245030e-05 0.416619418
[3,] 3.681849e-05 0.003309623
#Now, assignment to m103.avg.prop
m103.avg.prop <- prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
#results in error:
Error in FUN(X[[2L]], ...) :
only defined on a data frame with all numeric variables
The error you’re getting is because you’re trying to sum something that isn’t a number. Without reproducible code I can’t tell you exactly what is going on. But, one of the reasons we ask for a reproducible example is that in the process of making one, you will often discover the problem on your own.
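To show where that message comes from, here is a minimal sketch using a made-up data frame (`toy` and its values are hypothetical, not your data). It assumes one column was read in as character, which is only my guess about what is happening in your case:

```r
# Hypothetical toy data: X461 holds text ("0"/"1") instead of numbers
toy <- data.frame(PARTY = c(100, 100, 200),
                  X930  = c(0, 1, 0),
                  X461  = c("0", "1", "0"),
                  stringsAsFactors = FALSE)

# sum() on a data frame fails unless every column is numeric;
# capture the error message instead of stopping the script
msg <- tryCatch(sum(toy, na.rm = TRUE),
                error = function(e) conditionMessage(e))
print(msg)

# Restricting the sum to the numeric columns works
total <- sum(toy[, c("PARTY", "X930")], na.rm = TRUE)
print(total)
```

The exact wording of the message differs slightly between R versions, but it should start with "only defined on a data frame".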
In this case, I assume the data came from somewhere like Excel, which is notorious for doing surprising things to data. Try looking at str(m103); I expect one of the columns will be a character vector rather than numeric. Failing that, I would have to see your data.
However, there should be no difference between your assignment to a and your assignment to m103.avg.prop. As a side note, I like to avoid numbers in my variable names wherever possible, just to avoid confusing myself!
EDIT: Add runnable code:
I still cannot replicate your problem. Like I said above, the output of str(m103) and the output of str(a) will be informative. Also, sessionInfo(). Short of that, I’ll stick with my previous guesses…
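If the str() output is too long to scan with 91 columns, something like the following may help narrow it down. This is only a sketch on a made-up data frame (`toy` is hypothetical); substitute m103 for it in your session:

```r
# Hypothetical stand-in for m103, with one column stored as text
toy <- data.frame(ID    = c(15245, 15000, 29108),
                  PARTY = c(100, 100, 200),
                  X930  = c("0", "1", "0"),
                  stringsAsFactors = FALSE)

# Compact overview of every column's type
str(toy)

# Names of the columns that are not numeric
non_numeric <- names(toy)[!sapply(toy, is.numeric)]
print(non_numeric)
```

Any column name this returns is a candidate for the one that sum() is choking on.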