I want to conditionally subset a data frame without repeating the data frame's name inside the brackets. For example, if I have the following:
long_data_frame_name <- data.frame(x=1:10, y=1:10)
I want to say:
subset <- long_data_frame_name[x < 5,]
But instead, I have to say:
subset <- long_data_frame_name[long_data_frame_name$x < 5,]
plyr and ggplot handle this so beautifully. Is there any package that makes subsetting a data frame similarly beautiful?
It sounds like you are looking for the data.table package, which implements indexing syntax just like the one you describe. (data.table objects are essentially data.frames with added functionality, so you can continue to use them almost anywhere you would use a “plain old” data.frame.)
Matthew Dowle, the package’s author, argues for the advantages of [.data.table()’s indexing syntax in his answer to this popular SO [r]-tag question. His answer there could just as well have been written as a direct response to your question above!
Here’s an example:
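A minimal sketch of that syntax, reusing the data from the question. It assumes the data.table package is installed; the names long_data_table_name, small, and small_from_df are made up for illustration and are not from the original answer.

```r
library(data.table)

# Same data as in the question, built as a data.table instead of a data.frame
long_data_table_name <- data.table(x = 1:10, y = 1:10)

# Column names used in the row index are looked up inside the table,
# so the table name does not have to be repeated
small <- long_data_table_name[x < 5]

# An existing data.frame can be converted first, then indexed the same way
long_data_frame_name <- data.frame(x = 1:10, y = 1:10)
small_from_df <- as.data.table(long_data_frame_name)[x < 5, ]

print(small)
```

Both results keep the four rows where x is 1 through 4. If you would rather stay with a plain data frame, base R's subset(long_data_frame_name, x < 5) evaluates the condition in the same way, although its help page describes it as a convenience function meant mainly for interactive use.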