I have a data frame with an id column and some (potentially many) columns with values, here ‘v1’, ‘v2’:
df <- data.frame(id = c(1:5), v1 = c(0,15,9,12,7), v2 = c(9,32,6,17,11))
# id v1 v2
# 1 1 0 9
# 2 2 15 32
# 3 3 9 6
# 4 4 12 17
# 5 5 7 11
How can I extract rows where ALL values are larger than a certain value, say 10, which should return:
# id v1 v2
# 2 2 15 32
# 4 4 12 17
How can I extract rows where ANY (at least one) value is larger than 10, which should return:
# id v1 v2
# 2 2 15 32
# 4 4 12 17
# 5 5 7 11
See functions all() and any() for the first and second parts of your question respectively. The apply() function can be used to run functions over rows or columns (MARGIN = 1 is rows, MARGIN = 2 is columns, etc.). Note that I use apply() on df[, -1] to ignore the id variable when doing the comparisons.

Part 1:
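A minimal sketch consistent with the description above, assuming the df from the question. The names keep_all and res_all are only illustrative:

```r
# Data frame from the question
df <- data.frame(id = 1:5, v1 = c(0, 15, 9, 12, 7), v2 = c(9, 32, 6, 17, 11))

# For each row (MARGIN = 1) of the value columns (everything except id),
# test whether every value is greater than 10
keep_all <- apply(df[, -1], MARGIN = 1, function(x) all(x > 10))

# Use the resulting logical vector to subset the rows of df
res_all <- df[keep_all, ]
res_all
#   id v1 v2
# 2  2 15 32
# 4  4 12 17
```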
Part 2:
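Again a minimal sketch under the same assumptions, with any() in place of all(). The names keep_any and res_any are only illustrative:

```r
# Data frame from the question
df <- data.frame(id = 1:5, v1 = c(0, 15, 9, 12, 7), v2 = c(9, 32, 6, 17, 11))

# For each row of the value columns, test whether at least one value
# is greater than 10
keep_any <- apply(df[, -1], MARGIN = 1, function(x) any(x > 10))

# Use the resulting logical vector to subset the rows of df
res_any <- df[keep_any, ]
res_any
#   id v1 v2
# 2  2 15 32
# 4  4 12 17
# 5  5  7 11
```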
To see what is going on, x > 10 returns a logical vector for each row (via apply()) indicating whether each element is greater than 10. all() returns TRUE if all elements of the input vector are TRUE and FALSE otherwise. any() returns TRUE if any of the elements in the input is TRUE and FALSE if all are FALSE.

I then use the logical vector resulting from the apply() call to subset df (as shown above).
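The intermediate steps can be inspected on a single row. This is only an illustration using row 5 of the question's df; the variable names x, row_gt, and keep_any are made up for the example:

```r
# Data frame from the question
df <- data.frame(id = 1:5, v1 = c(0, 15, 9, 12, 7), v2 = c(9, 32, 6, 17, 11))

# Values of row 5 without the id column: v1 = 7, v2 = 11
x <- unlist(df[5, -1])

row_gt <- x > 10   # element-wise comparison: v1 FALSE, v2 TRUE
all(row_gt)        # FALSE, because not every element is TRUE
any(row_gt)        # TRUE, because at least one element is TRUE

# apply() repeats this for every row and collects one logical per row
keep_any <- apply(df[, -1], MARGIN = 1, function(x) any(x > 10))
unname(keep_any)
# [1] FALSE  TRUE FALSE  TRUE  TRUE
```

Row 5 therefore appears in the ANY result but not in the ALL result.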