I am looking for an “elegant” way to split a data frame by the levels of one column variable, then create a new output data frame that drops the factor variable and adds a new column for each of its levels. I can do this with functions such as split(), but that seems messy to me. I have been trying to do this using the melt() and cast() functions in the plyr package, but haven’t been successful in getting the exact output I need.
Here is what my data looks like:
> jumbo.df = read.csv(...)
> head(jumbo.df)
    PricingDate   Name  Rate
186  2012-03-05 Type A 2.875
187  2012-03-05 Type B 3.250
188  2012-03-05 Type C 3.750
189  2012-03-05 Type D 3.750
190  2012-03-05 Type E 4.500
191  2012-03-06 Type A 2.875
What I would like to do is split by the Name variable, drop the Name and Rate columns, and output one column each for Type A, Type B, Type C, Type D, and Type E holding the corresponding Rate series, with PricingDate as the ID:
> head(output.df)
PricingDate Type A Type B Type C Type D Type E
 2012-03-05  2.875  3.250  3.750  3.750  4.500
 2012-03-06  2.875 ...
Thanks!
Not sure if I get you right, but could it be that you just want to reshape your data into the wide format? If so, you have to use the melt and cast functions of the reshape (!) package. reshape2 is basically the same. Since your data is already in the molten format, i.e. the long format, a one-liner does what you want:
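A minimal sketch of such a one-liner, assuming the reshape2 package is installed (its dcast() is the data-frame counterpart of cast() in reshape). The small jumbo.df below is rebuilt by hand from the rows shown in the question, since the read.csv() source is not given:

```r
library(reshape2)  # assumption: reshape2 is installed

# Sample data mirroring the head() output in the question (row names omitted)
jumbo.df <- data.frame(
  PricingDate = as.Date(c(rep("2012-03-05", 5), "2012-03-06")),
  Name = c("Type A", "Type B", "Type C", "Type D", "Type E", "Type A"),
  Rate = c(2.875, 3.250, 3.750, 3.750, 4.500, 2.875)
)

# Long to wide: one row per PricingDate, one column per level of Name,
# cells filled with the matching Rate (NA where a combination is missing)
output.df <- dcast(jumbo.df, PricingDate ~ Name, value.var = "Rate")
output.df
```

The formula reads "rows ~ columns": PricingDate stays as the ID column and each level of Name becomes its own column. Naming the value column explicitly with value.var avoids relying on dcast() guessing it.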