I retrieve data from a SQL database into a data frame in R using:
query <- "SELECT date, identifier, somevalue FROM mytable"
data <- sqlQuery(conn, query)
Which gives me:
> data
        date identifier somevalue
1 2011-01-01          1      0.50
2 2011-01-02          1      0.40
3 2011-01-01          2      0.70
4 2011-01-02          2      0.10
5 2011-01-03          2      0.25
data <- data.frame(date=c("2011-01-01","2011-01-02","2011-01-01","2011-01-02","2011-01-03"), identifier=c(1,1,2,2,2), somevalue=c(0.5,0.4,0.7,0.1,0.25))
I would like to convert this into a numeric matrix using date as the rownames and identifier as the colnames:
> output
             1    2
2011-01-01 0.5 0.70
2011-01-02 0.4 0.10
2011-01-03  NA 0.25
output <- matrix(c(0.5,0.4,NA,0.7,0.1,0.25),3)
rownames(output) <- c("2011-01-01","2011-01-02","2011-01-03")
colnames(output) <- c(1,2)
I can’t figure out how to do this. I’ve looked into reshape and also into match but I always fail due to having duplicate rownames or identifiers.
I generally use dcast from reshape2 (but there are copious ways of doing this).
The one hiccup I ran into was using the “right” NA value, which in this case was NA_real_. dcast was throwing an error if I tried either NA or NA_integer_, which doesn’t make much sense to me, but I haven’t thought about it for very long.
Edit: OK, now I get it. The NA type needs to match the type of the rest of the data, apparently. I was expecting dcast to be able to convert to the appropriate NA type, but I guess not.
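To illustrate the dcast approach and the NA_real_ point, here is a minimal sketch using the sample data frame from the question. The value.var and fill arguments and the final conversion to a matrix are assumptions about one reasonable way to get the requested output, not necessarily the exact call used.

```r
library(reshape2)

# Sample data from the question
data <- data.frame(
  date = c("2011-01-01", "2011-01-02", "2011-01-01", "2011-01-02", "2011-01-03"),
  identifier = c(1, 1, 2, 2, 2),
  somevalue = c(0.5, 0.4, 0.7, 0.1, 0.25)
)

# One row per date, one column per identifier.
# fill = NA_real_ keeps the fill value the same type (double) as somevalue.
wide <- dcast(data, date ~ identifier, value.var = "somevalue", fill = NA_real_)

# dcast returns a data frame; drop the date column and use it as rownames
# to get a numeric matrix (assumed to be the desired final shape).
output <- as.matrix(wide[, -1])
rownames(output) <- as.character(wide$date)
```

Since every date/identifier pair appears at most once here, no aggregation function is needed; with duplicate pairs, dcast would fall back to counting rows unless fun.aggregate is supplied.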