This seems to be rather easy, but it has kept me busy for a while.
I have a dataframe (df) with n columns and a vector with the same number (n) of values.
The values in the vector are thresholds for the observations in the columns of the dataframe. So the question is: how do I tell R to use a different threshold for each column?
I want to keep all the observations in the dataframe that fulfill the threshold for their column (above or below; it doesn't matter for the example). Observations that do not fulfill the threshold criterion should be set to 0.
I don't want a subset of the dataframe.
Can anyone help? Thanks a lot in advance.
Given some example data and thresholds, we can use the mapply() function to work out which observations in each column are greater than or equal to that column's threshold. Using those indices, we can then replace the corresponding values with 0.
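The following is a minimal sketch of these steps. The data, the seed, and the values in thresh are illustrative assumptions rather than the original example, and df2 is just a copy so the input stays untouched.

```r
# Illustrative data: 10 rows, 5 columns (values and seed are assumptions)
set.seed(1)
df <- data.frame(matrix(rnorm(50), ncol = 5))
thresh <- c(0.5, 0.25, 0, -0.25, -0.5)  # one threshold per column

# Compare each column with its own threshold; for a data frame with more
# than one row the result simplifies to a logical matrix shaped like df
idx <- mapply(">=", df, thresh)

# Set every value at or above its column's threshold to 0
df2 <- df
df2[idx] <- 0
```

To zero the values that fail the comparison instead, negate the index: df2[!idx] <- 0.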
It is instructive to notice what mapply() returns in this case: a logical matrix with one column per column of the data frame, and it is those logical values that are used to select the observations that meet the threshold. You can use a different binary operator from the one I used; see ?">" for the various options. When writing the mapply() call, think of it in terms of the left-hand side and right-hand side of the binary operator: a call of the form mapply(">", lhs, rhs) corresponds to what we might otherwise write as lhs > rhs, applied to each pair of elements in turn.
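As a small sketch of that left-hand-side and right-hand-side reading, the snippet below checks that the mapply() form agrees with the comparison written out column by column, and then swaps in a different operator. The objects lhs and rhs are made-up inputs for illustration only.

```r
# Hypothetical inputs: a data frame on the left, one value per column on the right
lhs <- data.frame(a = c(1, 5, 9), b = c(2, 4, 6))
rhs <- c(4, 4)

# mapply() pairs column a with rhs[1] and column b with rhs[2]
res_gt <- mapply(">", lhs, rhs)

# The same comparison written out by hand, one column at a time
by_hand <- cbind(a = lhs$a > rhs[1], b = lhs$b > rhs[2])

# Any other binary comparison operator can be passed in the same way
res_le <- mapply("<=", lhs, rhs)
```

Here res_le is simply the element-wise negation of res_gt, since "<=" is the complement of ">" for non-missing values.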
Update: As @DWin has answered the comment about two thresholds, I will update my answer to match.
We can see which elements match both constraints, and the same construct can be used to select those elements that match.
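A minimal sketch of the two-threshold case might look like the following. The lower and upper vectors, the data, and the choice of strict inequalities are all assumptions for illustration; the original constraints may have differed.

```r
# Illustrative data and per-column bounds (all values are assumptions)
set.seed(42)
df <- data.frame(matrix(rnorm(50), ncol = 5))
lower <- c(-1, -0.5, -0.5, -1, -0.25)
upper <- c(1, 0.5, 0.5, 1, 0.25)

# TRUE where a value lies strictly between its column's two thresholds
both <- mapply(">", df, lower) & mapply("<", df, upper)

# Elements that satisfy both constraints, returned as a plain numeric vector
matched <- df[both]

# Keep the matching values and set everything else to 0
df2 <- df
df2[!both] <- 0
```

Combining two logical matrices with & keeps each comparison in the same mapply() form as the single-threshold case, so either operator can be changed independently.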