The Archive Base

Editorial Team
Asked: June 3, 2026 (2026-06-03T16:41:52+00:00)

I need to calculate and add to a data frame multiple new columns based

I need to calculate and add multiple new columns to a data frame, each derived from one of a subset of its existing columns. These columns all hold time series data (there is a common date column). For example, I need to calculate the change from the same month in the previous year for a dozen columns. I could specify and calculate them individually, but that becomes onerous with a large number of columns to transform, so I am trying to automate the process with a for loop.

I was doing OK until I tried to use ddply to create a column for the running total of the value for the year so far. What happens is that ddply is adding new rows during each iteration through the loop and including those new rows in the cumsum calculation. I have two questions.

Q. How can I get ddply to calculate the correct cumsum?
Q. How can I specify the name of the column during the ddply call, rather than using a dummy value and renaming afterward?

[Edit: I spoke too soon, the updated code below does NOT work at this point, just FYI]

require(lubridate)
require(plyr)
require(xts)

set.seed(12345)
# create dummy time series data
monthsback <- 24
startdate <- as.Date(paste(year(now()),month(now()),"1",sep = "-")) - months(monthsback)
mydf <- data.frame(mydate = seq(as.Date(startdate), by = "month", length.out = monthsback),
                   myvalue1 = runif(monthsback, min = 600, max = 800),
                   myvalue2 = runif(monthsback, min = 200, max = 300))

mydf$year <- as.numeric(format(as.Date(mydf$mydate), format="%Y"))
mydf$month <- as.numeric(format(as.Date(mydf$mydate), format="%m"))
newcolnames <- c('myvalue1','myvalue2')

for (i in seq_along(newcolnames)) {
    print(newcolnames[i])
    mydf$myxts <- xts(mydf[, newcolnames[i]], order.by = mydf$mydate)
    ## Calculate change over same month in previous year
    mylag <- 12
    mydf[, paste(newcolnames[i], "_yoy", sep = "", collapse = "")] <- as.numeric(diff(mydf$myxts, lag = mylag)/ lag(mydf$myxts, mylag))
    ## Calculate change over previous month
    mylag <- 1
    mydf[, paste(newcolnames[i], "_mom", sep = "", collapse = "")] <- as.numeric(diff(mydf$myxts, lag = mylag)/ lag(mydf$myxts, mylag))

    ## Calculate cumulative figure
    #mydf$newcol <- as.numeric(mydf$myxts)
    mydf$newcol <- 1
    mydf <- ddply(mydf, .(year), transform, newcol = cumsum(as.numeric(mydf$myxts)))
    colnames(mydf)[colnames(mydf)=="newcol"] <- paste(newcolnames[i], "_cuml", sep = "", collapse = "")

}

mydf
1 Answer

Editorial Team
Added an answer on June 3, 2026 at 4:41 pm (2026-06-03T16:41:56+00:00)

    In your loop, since myxts is not part of the data frame, it is not split up in the ddply statement along with everything else. Change it to:

    mydf$myxts <- xts(mydf[, newcolnames[i]], order.by = mydf$mydate)
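
Storing the series in the data frame is likely only half of the fix. The expression passed to `transform` also has to use the bare column name, because `mydf$myxts` always returns the whole column rather than the per-year piece that ddply is working on. The sketch below illustrates this on a hypothetical toy data frame (`toy`, `value` and `cuml` are made-up names) and uses a plain numeric column instead of an xts object to keep the example small.

```r
library(plyr)

# Hypothetical toy data: two years with three observations each
toy <- data.frame(year  = c(2020, 2020, 2020, 2021, 2021, 2021),
                  value = c(1, 2, 3, 10, 20, 30))

# The bare name `value` is evaluated inside each per-year piece,
# so the running total restarts at the beginning of every year
ok <- ddply(toy, .(year), transform, cuml = cumsum(value))

print(ok$cuml)
# [1]  1  3  6 10 30 60
```

Writing `cumsum(toy$value)` in the same call would hand every piece the full six-value column, which is the kind of mismatch described in the question.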
    

    I don’t know of any way to use dynamically generated names with transform.
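
One possible workaround, which swaps `transform` for something else rather than making it accept dynamic names, is to build the column name as a string and assign with `[[<-`, letting base R's `ave()` do the per-year grouping. This is a minimal sketch on hypothetical toy data (`toy`, `cols` and `cuml_name` are illustrative names), not a drop-in replacement for the loop in the question.

```r
# Hypothetical toy data standing in for mydf
toy <- data.frame(year     = c(2020, 2020, 2021, 2021),
                  myvalue1 = c(1, 2, 3, 4),
                  myvalue2 = c(10, 20, 30, 40))

cols <- c("myvalue1", "myvalue2")

for (col in cols) {
  # Build the target name first and assign with [[<-, so no renaming is needed
  cuml_name <- paste0(col, "_cuml")
  # ave() applies cumsum within each year and keeps the original row order
  toy[[cuml_name]] <- ave(toy[[col]], toy$year, FUN = cumsum)
}

print(toy$myvalue1_cuml)
# [1] 1 3 3 7
print(toy$myvalue2_cuml)
# [1] 10 30 30 70
```

Because `ave()` returns a vector the same length as its input, the data frame never gains or loses rows, and the new column gets its final name on creation.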

© 2021 The Archive Base. All Rights Reserved