This is a very simple question.
I have a lengthy dataset and want to create a subset based on certain entries in a particular column. In this case, I am setting it up like this:
Example data:
> NL
SNP alleles
rs1234 A_T
rs1235 A_G
rs2343 A_T
rs2342 G_C
rs1134 C_G
rs1675 T_A
rs8543 A_T
rs2842 G_A
P <- subset(NL, alleles = "A_T", alleles = "T_A", alleles = "G_C", alleles = "C_G")
This runs without error, but the resulting P is not subset in any way (tail of P still shows same number of entries as original NL).
What am I doing wrong?
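The most obvious error is using “=” when you mean “==”. With “=”, each alleles = "..." is passed to subset() as a named argument that it silently ignores, and since no filtering condition is supplied, every row is kept; that is why you get no error. But I’m guessing from context that you really want to “split” this data: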
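The original code for this step is not shown, so here is a minimal sketch of what it could look like, assuming NL is a data frame with the two columns from the question (the name by_allele is made up for illustration):

```r
# Rebuild the example data from the question (assumed to be a data frame)
NL <- data.frame(
  SNP = c("rs1234", "rs1235", "rs2343", "rs2342",
          "rs1134", "rs1675", "rs8543", "rs2842"),
  alleles = c("A_T", "A_G", "A_T", "G_C", "C_G", "T_A", "A_T", "G_A"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per distinct value of alleles
by_allele <- split(NL, NL$alleles)

names(by_allele)      # the distinct alleles values
by_allele[["A_T"]]    # just the rows where alleles is "A_T"
```

Individual pieces can then be pulled out by name, as in the last line.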
Which will create a list of data frames, each of which has one of the values for alleles.
But perhaps you do want to use pattern matching:
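The pattern-matching code is also missing here. One way it might be done, as a sketch using grepl() with a regular expression (the exact pattern is my assumption, built from the four values in the question):

```r
# Rebuild the example data from the question (assumed to be a data frame)
NL <- data.frame(
  SNP = c("rs1234", "rs1235", "rs2343", "rs2342",
          "rs1134", "rs1675", "rs8543", "rs2842"),
  alleles = c("A_T", "A_G", "A_T", "G_C", "C_G", "T_A", "A_T", "G_A"),
  stringsAsFactors = FALSE
)

# Anchored alternation: keep rows whose alleles value is exactly one of the four
pattern <- "^(A_T|T_A|G_C|C_G)$"
P <- NL[grepl(pattern, NL$alleles), ]

P
```

The ^ and $ anchors keep the match exact; drop them if partial matches are wanted.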
And illustrating with what I think was your comment-example:
The subset version:
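The code that followed is not preserved, so this is only a sketch of how the subset() call from the question could be corrected, assuming the goal is to keep rows matching any of the four values:

```r
# Rebuild the example data from the question (assumed to be a data frame)
NL <- data.frame(
  SNP = c("rs1234", "rs1235", "rs2343", "rs2342",
          "rs1134", "rs1675", "rs8543", "rs2842"),
  alleles = c("A_T", "A_G", "A_T", "G_C", "C_G", "T_A", "A_T", "G_A"),
  stringsAsFactors = FALSE
)

# A single logical condition: "==" for comparison, "|" to combine alternatives
P <- subset(NL, alleles == "A_T" | alleles == "T_A" |
                alleles == "G_C" | alleles == "C_G")

# Equivalent and shorter: test membership with %in%
P2 <- subset(NL, alleles %in% c("A_T", "T_A", "G_C", "C_G"))

identical(P, P2)
nrow(P)
```

Either form passes one logical expression as the second argument, which is what subset() actually filters on.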