I am looking for an elegant way to basically split a data frame by


Asked: June 5, 2026 (2026-06-05T01:19:53+00:00)


I am looking for an “elegant” way to split a data frame by the levels of one factor column and reshape it so that the output data frame drops the factor column and instead has one new column per factor level. I can do this with functions such as split(), but that seems like the messy way to me. I have been trying to do this using the melt() and cast() functions in the plyr package, but haven’t been able to get the exact output I need.

Here is what my data looks like:

> jumbo.df = read.csv(...)
> head(jumbo.df)
         PricingDate  Name     Rate 
    186  2012-03-05   Type A   2.875  
    187  2012-03-05   Type B   3.250  
    188  2012-03-05   Type C   3.750  
    189  2012-03-05   Type D   3.750  
    190  2012-03-05   Type E   4.500  
    191  2012-03-06   Type A   2.875

What I would like to do is split by the Name variable, drop the Name and Rate columns, and output one column each for Type A, Type B, Type C, Type D, and Type E holding the corresponding Rate series, with PricingDate as the ID:

> head(output.df)
         PricingDate  Type A   Type B    Type C    Type D    Type E 
         2012-03-05    2.875    3.250     3.750     3.750     4.500  
         2012-03-06    2.875    ...

Thanks!


1 Answer

Editorial Team · Answer 1 · 2026-06-05T01:19:54+00:00

If I understand you correctly, you just want to reshape your data from the long format into the wide format. melt and cast come from the reshape (!) package, not plyr. Its successor reshape2 works the same way, but splits cast into dcast (returns a data frame) and acast (returns an array). Since your data is already in the molten format, i.e. the long format, there is no need to melt it first, and a one-liner does what you want:

df <- read.table(textConnection("PricingDate  Name     Rate 
                                 2012-03-05   TypeA   2.875  
                                 2012-03-05   TypeB   3.250  
                                 2012-03-05   TypeC   3.750  
                                 2012-03-05   TypeD   3.750  
                                 2012-03-05   TypeE   4.500  
                                 2012-03-06   TypeA   2.875"), header=TRUE, row.names=NULL)
library(reshape2)
# value.var names the column that fills the cells; if omitted, dcast guesses Rate and prints a message
dcast(df, PricingDate ~ Name, value.var = "Rate")
  PricingDate TypeA TypeB TypeC TypeD TypeE
1  2012-03-05 2.875  3.25  3.75  3.75   4.5
2  2012-03-06 2.875    NA    NA    NA    NA
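
If you would rather not load an extra package, base R's reshape() should get you to the same result. This is only a minimal sketch, assuming the same toy data as above (rebuilt here with data.frame() so the snippet stands alone); the name `wide` is just an illustrative choice:

```r
# Same toy data as above, in long format
df <- data.frame(
  PricingDate = c(rep("2012-03-05", 5), "2012-03-06"),
  Name = c("TypeA", "TypeB", "TypeC", "TypeD", "TypeE", "TypeA"),
  Rate = c(2.875, 3.250, 3.750, 3.750, 4.500, 2.875),
  stringsAsFactors = FALSE
)

# One row per PricingDate, one column per level of Name
wide <- reshape(df, idvar = "PricingDate", timevar = "Name", direction = "wide")

# reshape() prefixes the new columns with "Rate."; strip that prefix
names(wide) <- sub("^Rate\\.", "", names(wide))

# Reset the row names inherited from the long data
rownames(wide) <- NULL

wide
```

Dates with no entry for a given type end up as NA in that column, just as with dcast.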
