Say I have a data frame df with two or more columns. Is there an easy way to use unique() or another R function to create a subset of the unique combinations of two or more columns?
I know I can use sqldf() and write an easy "SELECT DISTINCT var1, var2, ... varN" query, but I am looking for an R way of doing this.
It occurred to me to try ftable coerced to a dataframe and use the field names, but I also get the cross tabulations of combinations that don’t exist in the dataset:
uniques <- as.data.frame(ftable(df$var1, df$var2))
unique works on data.frames, so unique(df[c("var1","var2")]) should be what you want.
Another option is distinct from the dplyr package: distinct(df, var1, var2).
Note: For older versions of dplyr (< 0.5.0, 2016-06-24), distinct required an additional step of selecting the columns first (or the oldish way, distinct(select(df, var1, var2))).
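As a rough illustration of both approaches, here is a minimal sketch on a made-up data frame. The data and the extra column var3 are assumptions for the example only; var1 and var2 follow the question. The dplyr part runs only if the package is installed.

```r
# Hypothetical example data; only the names var1 and var2 come from the question.
df <- data.frame(
  var1 = c("a", "a", "b", "b", "a"),
  var2 = c(1, 1, 2, 3, 1),
  var3 = c(10, 20, 30, 40, 50)
)

# Base R: unique() on a data frame drops duplicated rows, so subsetting to
# the columns of interest first yields the combinations that actually occur.
uniques <- unique(df[c("var1", "var2")])
print(uniques)

# dplyr (>= 0.5.0): distinct() with column names gives the same combinations.
if (requireNamespace("dplyr", quietly = TRUE)) {
  print(dplyr::distinct(df, var1, var2))
}
```

Unlike the ftable approach, only pairs present in the data are returned. unique() keeps the original row names of the first occurrences; rownames(uniques) <- NULL resets them if that matters.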