Given that the dataframes
df1 <- data.frame(CustomerId=c(1:6),Product=c(rep("Toaster",3),rep("Radio",3)))
df2 <- data.frame(CustomerId=c(2,4,6),State=c(rep("Alabama",2),rep("Ohio",1)))
are stored in a list
dflist <- c(df1,df2)
how do I run sqldf queries (joins) on these dataframes?
Failed attempts:
test <- sqldf("select a.CustomerId, a.Product, b.State from dflist[1] a
inner join dflist[2] b on b.id = a.id")
test <- sqldf("select a.CustomerId, a.Product, b.State from dflist$df1 a
inner join dflist$df2 b on b.CustomerId = a.CustomerId")
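If you copy your data.frames from the list to a new environment, you can use the envir argument to sqldf(). Alternatively, name the elements of the list and use with().
Note a couple of things:
dflist is created using list(), not c(). Note the difference: c() flattens the data frames into a single list of their columns.
The data frames in the list are named.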
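To make the list versus c difference concrete, here is a minimal sketch using the data frames from the question. The name flattened is only illustrative and does not come from the original post.

```r
df1 <- data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

# c() treats each data frame as a list of columns and concatenates them,
# so the result is a plain list of four columns, not two data frames.
flattened <- c(df1, df2)   # 'flattened' is an illustrative name
length(flattened)          # 4
names(flattened)           # "CustomerId" "Product" "CustomerId" "State"

# list() keeps each data frame intact; naming the elements lets them
# be looked up by name later.
dflist <- list(df1 = df1, df2 = df2)
length(dflist)             # 2
names(dflist)              # "df1" "df2"
```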
Create a new environment to work in
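One way this could look is sketched below. It assumes the sqldf package is installed and uses base R's list2env to copy the named list elements into the new environment. The name sqldf_env is an assumption made for illustration.

```r
library(sqldf)

df1 <- data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))
dflist <- list(df1 = df1, df2 = df2)

# Copy every named element of the list into a fresh environment
# ('sqldf_env' is an illustrative name).
sqldf_env <- list2env(dflist, envir = new.env())

# Tell sqldf to look up the table names df1 and df2 in that environment.
test <- sqldf("select a.CustomerId, a.Product, b.State
               from df1 a
               inner join df2 b on b.CustomerId = a.CustomerId",
              envir = sqldf_env)
```

The result should contain one row for each customer present in both data frames (customers 2, 4 and 6).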
A simpler approach using with()
You could simply use with(), which evaluates locally (it is important that dflist is a named list here).
Another simple approach using proto
This uses the proto package, which is loaded with sqldf.
Using data.table
Or you could use data.table, which gives SQL-like approaches (see FAQ 2.16).
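The three alternatives above (with, proto and data.table) might be sketched as follows. This is a hedged illustration, not necessarily the exact code from the original answer. It assumes the sqldf, proto and data.table packages are installed. The names query, res_with, res_proto, res_dt, dt1 and dt2 are made up for the example. The as.integer conversion is an added precaution so that both key columns have the same type.

```r
library(sqldf)
library(data.table)

df1 <- data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))
dflist <- list(df1 = df1, df2 = df2)   # must be a named list

query <- "select a.CustomerId, a.Product, b.State
          from df1 a
          inner join df2 b on b.CustomerId = a.CustomerId"

# 1. with(): the sqldf call is evaluated where df1 and df2 are the
#    elements of the named list.
res_with <- with(dflist, sqldf(query))

# 2. proto: convert the named list to a proto object (an environment)
#    and pass it as the envir argument.
res_proto <- sqldf(query, envir = proto::as.proto(dflist))

# 3. data.table: a keyed join instead of SQL.
dt1 <- as.data.table(dflist$df1)
dt2 <- as.data.table(dflist$df2)
dt2[, CustomerId := as.integer(CustomerId)]  # precaution: match dt1's integer key
setkey(dt1, CustomerId)
setkey(dt2, CustomerId)
res_dt <- dt1[dt2, nomatch = 0]              # inner join on the key
```

Each result should hold the three customers that appear in both data frames.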