The Archive Base

Editorial Team
Asked: May 29, 2026 (2026-05-29T22:40:25+00:00)

I have a large data frame (named z) that looks like this:

    RPos    M1
    1   -0.00020
    2   0.00010
    3   -0.00012
    4   -0.00035
    5   -0.00038 
...etc (about 300,000 observations)

It is essentially a time series, although it is stored as a data frame, not a ts or zoo object.
RPos is the index number (explicitly stored), and M1 is some metric.

I have another data frame (named actionlist) with about 30,000 non-consecutive observations. Each value in actionlist’s RPos column marks the last of 34 consecutive points in z.

My final piece of data is a single data frame (named x) of only 34 consecutive observations.

My goal is to calculate the correlation coefficient between x and the 34-point segment that ends at each observation in actionlist.

To do this I must extract these segments of 34 consecutive points from z (the large data frame).

Currently, I am doing it like this:

    n1 <- 33:0
    for (i in 1:nrow(actionlist)) {
        crs[i, 2] <- cor(z[actionlist$RPos[i] + n1, 2], x[, 2])
    }

When looking at the Rprof readout this is what I get:

    $by.self
                  self.time self.pct total.time total.pct
    [.data.frame       0.68    25.37       0.98     36.57
    .Call              0.22     8.21       0.22      8.21
    cor                0.16     5.97       2.30     85.82
    ...etc

It looks as though [.data.frame is taking the longest.
Specifically I am pretty sure that it is this part:
z[actionlist$RPos[i]+n1,2]

How can I speed up (eliminate the need for?) this part of the function?

I asked a similar question before, except that instead of looking within a restricted list (actionlist) I was looking at every possible window of 34 consecutive observations within z. The answer was posted here, but I cannot figure out how to adapt it to a restricted list.

Any help would be much appreciated!

1 Answer

  1. Editorial Team
    Added an answer on May 29, 2026 at 10:40 pm (2026-05-29T22:40:26+00:00)

    The most straightforward approach is probably to build
    a matrix containing the data you want
    to compute the correlation with, and to drop the loop altogether.

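    # Sample data
    n <- 3e5   # rows in z
    m <- 3e4   # number of end-points in actionlist
    k <- 34    # points per window, as in the question
    z <- data.frame(
      RPos = 1:n,
      M1   = rnorm(n)
    )
    actionlist <- sample( k:n, m )  # here a plain vector of end positions
    x <- rnorm(k)                   # here a plain vector of k values
    
    system.time( for (j in 1:10) {
      # Index of the observations we want: one row per end-point, k columns
      i <- sapply( (k-1):0, function(u) actionlist - u )
      # Data we want to compute the correlation with
      y <- matrix( z$M1[i], nrow=nrow(i) )
      # Computations: correlate each row of y with x
      result <- cor(t(y),x)
    } ) # 150ms per iteration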
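
The sample code above uses plain vectors for actionlist and x, whereas the question describes both as data frames. A minimal sketch of the same idea with data frames follows, under two assumptions that the question does not confirm: that RPos in z equals the row number, and that each window ends at the listed RPos (as the prose says; the question's loop adds the offsets 33:0 instead, which would select the 34 rows starting at RPos, in reverse order). The small sizes, the column name r in crs, and the use of outer() in place of sapply() are illustrative choices, not part of the original answer. The last lines check the vectorised result against a loop over a plain vector.

```r
# Hypothetical stand-ins for the objects described in the question.
set.seed(1)
n <- 1000   # rows in z (the question has about 300,000)
m <- 50     # rows in actionlist (the question has about 30,000)
k <- 34     # points per window

z <- data.frame(RPos = 1:n, M1 = rnorm(n))
actionlist <- data.frame(RPos = sort(sample(k:n, m)))
x <- data.frame(RPos = 1:k, M1 = rnorm(k))

# Pull the column out once, so the indexing below works on a plain
# numeric vector instead of going through `[.data.frame`.
# Assumption: z$RPos equals the row number of z.
m1 <- z$M1

# Row j holds the positions end_j - (k - 1), ..., end_j (oldest first).
idx <- outer(actionlist$RPos, (k - 1):0, "-")

# m1[idx] is read column by column, so y has the same shape as idx
# and row j of y is the window that ends at actionlist$RPos[j].
y <- matrix(m1[idx], nrow = nrow(idx))

# One correlation per window; cor() correlates each column of t(y) with x.
crs <- data.frame(RPos = actionlist$RPos,
                  r    = as.vector(cor(t(y), x[, 2])))

# Cross-check against an explicit loop over the plain vector.
n1 <- (k - 1):0
loop_r <- vapply(actionlist$RPos,
                 function(p) cor(m1[p - n1], x[, 2]),
                 numeric(1))
stopifnot(isTRUE(all.equal(crs$r, loop_r)))
```

Extracting z$M1 once should also help if the loop is kept, since indexing a plain numeric vector avoids the [.data.frame call that leads the Rprof output in the question.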
    
