The Archive Base

Editorial Team
Asked: May 25, 2026


I am a novice R user working with a data set of 40,000 rows and 300 columns. I have found a way to do what I want, but my code takes over an hour to run on my machine, and I suspect an expert could suggest something quicker (I can do this in Excel in half the time). My solution is posted at the end.

What I would like to do is the following:

  1. Compute the average value of each column NY1 to NYn for each value of the YYYYMMbucket column.

  2. Divide each original value by the average for its YYYYMMbucket value.

Here is a sample of my original data set:

     YYYYMMbucket    NY1  NY2  NY3   NY4
1      200701.3     0.309  NA 20.719 16260
2      200701.3     0.265  NA 19.482 15138
3      200701.3     0.239  NA 19.168 14418
4      200701.3     0.225  NA 19.106 14046
5      200701.3     0.223  NA 19.211 14040
6      200701.3     0.234  NA 19.621 14718
7      200701.3     0.270  NA 20.522 15780
8      200701.3     0.298  NA 22.284 16662
9      200701.2     0.330  NA 23.420 16914
10     200701.2     0.354  NA 23.805 17310
11     200701.2     0.388  NA 24.095 17448
12     200701.2     0.367  NA 23.954 17640
13     200701.2     0.355  NA 23.255 17748
14     200701.2     0.346  NA 22.731 17544
15     200701.2     0.347  NA 22.445 17472
16     200701.2     0.366  NA 21.945 17634
17     200701.2     0.408  NA 22.683 18876
18     200701.2     0.478  NA 23.189 21498
19     200701.2     0.550  NA 23.785 22284
20     200701.2     0.601  NA 24.515 22368

This is what my averages look like:

     YYYYMMbucket  NY1M     NY2M
1      200701.1  0.4424574   NA
2      200701.2  0.4530000   NA
3      200701.3  0.2936935   NA
4      200702.1  0.4624063   NA
5      200702.2  0.4785937   NA
6      200702.3  0.3091161   NA
7      200703.1  0.4159687   NA
8      200703.2  0.4491875   NA
9      200703.3  0.2840081   NA
10     200704.1  0.4279137   NA

How I would like my final output to look:

  NY1avgs   NY2avgs    NY3avgs
1  1.052117     NA  0.7560868
2  0.9023011    NA  0.7109456
3  0.8137734    NA  0.699487
4  0.7661047    NA  0.6972245
5  0.7592949    NA  0.7010562
6  0.7967489    NA  0.7160181
7  0.9193256    NA  0.7488978
8  1.014663     NA  0.8131974
9  0.7284768    NA  0.857904

Here’s how I did it:

First I used “plyr” to compute my averages, simple enough:

test <- ddply(prf.delete2b, .(YYYYMMbucket), summarise,
    NY1M = mean(NY1), NY2M = mean(NY2) ... ...)

Then I used a series of the following:

x <- c(1:40893)

lookv <- function(x,ltab,rcol=2) ltab[max(which(ltab[,1]<=x)),rcol]

NY1Fun <- function(x) (prf.delete2b$NY1[x] / lookv((prf.delete2b$YYYYMMbucket[x]),test,2))

NY2Fun <- function(x) (prf.delete2b$NY2[x] / lookv((prf.delete2b$YYYYMMbucket[x]),test,3))

NY1Avgs <- lapply(x, NY1Fun)
NY2Avgs <- lapply(x, NY2Fun)

I also tried a variant of the above by saying:

NY1Fun <- function(x) (prf.delete2b$NY1[x] / subset(test, YYYYMMbucket == prf.delete2b$YYYYMMbucket[x], select =c(NY1M)))

lapply(x, NY1Fun)

Each variant of NYnFun takes a good 20 seconds to run, so doing this 300 times takes far too long. Can anyone recommend an alternative to what I posted, or point out any novice mistakes I’ve made?
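
The row-by-row lookup above can in principle be replaced by a single vectorized lookup. The following is a minimal sketch rather than a tested drop-in: it assumes every bucket value in the data also appears exactly in the averages table, so that match() can stand in for the <= search in lookv. The names prf and avgs are small made-up stand-ins for prf.delete2b and test, and aggregate() is used here only to build the toy averages.

```r
# Toy stand-in for prf.delete2b (hypothetical values, for illustration only)
prf <- data.frame(
  YYYYMMbucket = c(200701.3, 200701.3, 200701.2, 200701.2),
  NY1 = c(0.309, 0.265, 0.330, 0.354),
  NY3 = c(20.719, 19.482, 23.420, 23.805)
)

# Per-bucket means, one row per bucket (same shape as the ddply result)
avgs <- aggregate(prf[-1], by = list(YYYYMMbucket = prf$YYYYMMbucket), FUN = mean)

# One lookup for all rows: position of each row's bucket in the averages table
idx <- match(prf$YYYYMMbucket, avgs$YYYYMMbucket)

# Divide every NY column by its bucket mean in a single data-frame operation
ratios <- prf[-1] / avgs[idx, -1]
```

Here idx is computed once and reused for every column, so no per-row function call is needed.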


1 Answer

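Editorial Team
Added an answer on May 25, 2026 at 4:25 pm

How about:

# Attach each row's bucket averages (merge() sorts the result by YYYYMMbucket by default)
test2 <- merge(prf.delete2b, test, all.x=TRUE)
# Original NY columns divided by the matching NYnM average columns
test2[2:ncol(prf.delete2b)]/test2[(ncol(prf.delete2b)+1):ncol(test2)]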
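
An alternative that skips the separate averages table and keeps the rows in their original order is base R's ave(). This is a minimal sketch, assuming the bucket column is named YYYYMMbucket and that every other column should be normalised; prf is a made-up stand-in for prf.delete2b, and the NYnavgs output names simply follow the naming in the question.

```r
# Toy stand-in for prf.delete2b (hypothetical values, for illustration only)
prf <- data.frame(
  YYYYMMbucket = c(200701.3, 200701.2, 200701.3, 200701.2),
  NY1 = c(0.309, 0.330, 0.265, 0.354),
  NY2 = NA_real_,
  NY3 = c(20.719, 23.420, 19.482, 23.805)
)

# Columns to normalise: everything except the bucket column
ny_cols <- setdiff(names(prf), "YYYYMMbucket")

# ave() returns, for each row, the mean of that row's bucket,
# in the original row order, so the division lines up row by row
normalised <- as.data.frame(lapply(prf[ny_cols], function(col) {
  col / ave(col, prf$YYYYMMbucket, FUN = mean)
}))
names(normalised) <- paste0(ny_cols, "avgs")
```

Because ave() is called once per column rather than once per row, this should scale to 300 columns far better than the per-row lapply, though it has not been timed on the real data. A column containing NA in a bucket yields NA for that bucket, the same as mean() without na.rm = TRUE in the ddply call.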
    