I am trying to remove duplicated rows by one column (e.g., the 1st column) in an R matrix. How can I extract the set of rows that is unique by one column from a matrix? I’ve used
x_1 <- x[unique(x[,1]),]
While the size is correct, all of the values are NA. So instead, I tried
x_1 <- x[-duplicated(x[,1]),]
But the dimensions were incorrect.
I think you’re confused about how subsetting works in R.
unique(x[,1]) will return the set of unique values in the first column. If you then try to subset using those values, R thinks you’re referring to rows of the matrix. So you’re likely getting NAs because the values refer to rows that don’t exist in the matrix.

Your other attempt runs afoul of the fact that duplicated returns a boolean vector, not a vector of indices. So putting a minus sign in front of it converts it to a vector of 0’s and -1’s, which R again interprets as row indices: the 0’s are ignored and any -1’s drop only the first row.

Try replacing the ‘-’ with a ‘!’ in front of duplicated, which is the boolean negation operator. Something like this:
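
Here is a minimal sketch of that fix. The small matrix x below is made up for illustration (it is an assumption, not your data), and drop = FALSE is an extra precaution I've added so the result stays a matrix even if only one row survives.

```r
# Hypothetical example matrix; the first column contains repeated values.
# matrix() fills column by column, so column 1 is 1, 1, 2, 3, 3.
x <- matrix(c(1, 1, 2, 3, 3,
              10, 20, 30, 40, 50),
            ncol = 2)

# duplicated() marks every repeat of a value after its first occurrence.
dup <- duplicated(x[, 1])

# Logical negation keeps the rows whose first-column value is new.
x_1 <- x[!dup, , drop = FALSE]

# For comparison: the minus sign turns dup into 0's and -1's,
# so this only drops row 1 instead of dropping the duplicates.
x_wrong <- x[-dup, , drop = FALSE]

print(x_1)
print(dim(x_wrong))
```

With this toy input, x_1 keeps the first row seen for each value in column 1, while x_wrong still contains duplicates, which matches the wrong dimensions you saw.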