Editorial Team
Asked: May 26, 2026

I have the following matching problem: I have two data.frames, one with an observation every month (per company ID) and one with an observation every quarter (per company ID). Note that quarter means fiscal quarter, so Q1 = Jan, Feb, Mar is not necessarily correct, and a fiscal quarter is not necessarily 3 months long.

For every month and company, I want to get the value of the quarter that month belongs to. Consequently, several months share the value of one quarter. See the example below:

monthlyData <- data.frame(ID = rep(c("A", "B"), each = 5),
                  Month = rep(1:5, times = 2),
                  MonValue = 1:10)
monthlyData
   ID Month MonValue
1   A     1        1
2   A     2        2
3   A     3        3
4   A     4        4
5   A     5        5
6   B     1        6
7   B     2        7
8   B     3        8
9   B     4        9
10  B     5       10

#Quarterly data, i.e. the value of every quarter has to be matched to several months in monthlyData
#However, I want to match fiscal quarters, which means that one quarter is not necessarily 3 months long
qtrData <- data.frame(ID = rep(c("A", "B"), each = 2),
                  startMonth = c(1, 4, 1, 3),
                  endMonth   = c(3, 5, 2, 5),
                  QTRValue   = 1:4)
qtrData
  ID startMonth endMonth QTRValue
1  A          1        3        1
2  A          4        5        2
3  B          1        2        3
4  B          3        5        4

#Desired output
   ID Month MonValue QTRValue
1   A     1        1        1
2   A     2        2        1
3   A     3        3        1
4   A     4        4        2
5   A     5        5        2
6   B     1        6        3
7   B     2        7        3
8   B     3        8        4
9   B     4        9        4
10  B     5       10        4

Note: I posted this question on R-help months ago, but got no answer and eventually found a solution myself (see R-help). I have since asked a data.table question on Stack Overflow in which this problem came up as well, and Andrie asked me to post it again because he apparently has a good solution for it (see Question on SO).

UPDATE: See Matthew Dowle’s comment: how does the real data look?

The data below is more realistic. I added a few rows, but the main change is the endMonth column in qtrData: the startMonth of a quarter is no longer necessarily the endMonth of the previous quarter plus one month. With the roll option, I therefore think you need one more line of code. Without it you get 20 rows back, whereas Andrie’s solution returns the desired 17 rows. With that extra line there is no performance difference anymore, unless I am missing something.

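library(data.table)
library(rbenchmark)

monthlyData_new <- data.table(ID = rep(c("A", "B"), each = 10),
                  Month = rep(1:10, times = 2),
                  MonValue = 1:20)

qtrData_new <- data.table(ID = rep(c("A", "B"), each = 3),
                  startMonth = c(1, 4, 7, 1, 3, 8),
                  endMonth   = c(3, 5, 10, 2, 5, 10),
                  QTRValue   = 1:6)

setkey(qtrData_new, ID)
setkey(monthlyData_new, ID)

# copy() is needed: plain assignment only creates a second name for the same
# data.table, so setkey() on it would also re-key the *_new tables
qtrData1 <- copy(qtrData_new)
setkey(qtrData1, ID, startMonth)
monthlyData1 <- copy(monthlyData_new)
setkey(monthlyData1, ID, Month)

withTable1 <- function(){
  # rolling join: each month picks up the latest quarter starting on or before it;
  # in the result, startMonth holds the month itself
  xx <- qtrData1[monthlyData1, roll=TRUE]
  # drop months that lie past the end of the rolled-forward quarter
  xx <- xx[startMonth <= endMonth]

}

withTable2 <- function(){
  # recent data.table versions need allow.cartesian=TRUE for this row-multiplying join
  yy <- monthlyData_new[qtrData_new, allow.cartesian=TRUE][Month >= startMonth & Month <= endMonth]

}

# Note: the functions are passed without parentheses, so this times only the
# lookup of the function names; write withTable1() and withTable2() to time the joins
benchmark(withTable1, withTable2, replications=1e6)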
        test replications elapsed relative user.self sys.self user.child sys.child
1 withTable1      1000000   4.244 1.028599     4.232    0.008          0         0
2 withTable2      1000000   4.126 1.000000     4.096    0.028          0         0
1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 2:46 pm

    Here are two solutions, one using base R and one using data.table. In the benchmark below the data.table solution is about 30% faster than base R, and it is also much easier to read, so I recommend data.table for this.


    Base R

    Since you asked for an efficient solution, I use vapply:

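    # For each (id, month) pair, return the row index in `data` of the quarter
    # with the same ID whose [startMonth, endMonth] range contains the month.
    # vapply() stops with an error unless exactly one quarter matches.
    matchData <- function(id, month, data=qtrData){
      vapply(seq_along(id),
          function(i)which(
                id[i]==data$ID &
                    month[i] >= data$startMonth &
                    month[i] <= data$endMonth),
          FUN.VALUE=integer(1),
          USE.NAMES=FALSE
          )
    }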
    
    
    within(monthlyData, 
        Value <- qtrData$QTRValue[matchData(
                   monthlyData$ID, monthlyData$Month, qtrData)]
    )
    
       ID Month MonValue Value
    1   A     1        1     1
    2   A     2        2     1
    3   A     3        3     1
    4   A     4        4     2
    5   A     5        5     2
    6   B     1        6     3
    7   B     2        7     3
    8   B     3        8     4
    9   B     4        9     4
    10  B     5       10     4
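
A vectorised alternative in base R, offered as a minimal sketch rather than as part of the original answer: merge() pairs every month with every quarter of the same company, and a range filter keeps only the pairs where the month falls inside the quarter. The name `joined` is introduced here for illustration. Unlike the vapply version, months that match no quarter are silently dropped instead of raising an error.

```r
# Same example data as in the question.
monthlyData <- data.frame(ID = rep(c("A", "B"), each = 5),
                          Month = rep(1:5, times = 2),
                          MonValue = 1:10)
qtrData <- data.frame(ID = rep(c("A", "B"), each = 2),
                      startMonth = c(1, 4, 1, 3),
                      endMonth   = c(3, 5, 2, 5),
                      QTRValue   = 1:4)

# Pair every month with every quarter of the same company ...
joined <- merge(monthlyData, qtrData, by = "ID")
# ... then keep only the pairs where the month lies inside the quarter.
joined <- joined[joined$Month >= joined$startMonth &
                 joined$Month <= joined$endMonth, ]
# Restore the original row order and drop the helper columns.
joined <- joined[order(joined$ID, joined$Month),
                 c("ID", "Month", "MonValue", "QTRValue")]
rownames(joined) <- NULL
joined
```

The intermediate table has one row per month-quarter pair within a company, so memory use grows with the number of quarters per ID.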
    

    data.table

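    The same match using data.table:

    library(data.table)
    mD <- data.table(monthlyData, key="ID")
    qD <- data.table(qtrData, key="ID")
    # Join on ID, then keep the months inside each quarter; recent data.table
    # versions need allow.cartesian=TRUE because the join multiplies rows.
    mD[qD, allow.cartesian=TRUE][Month>=startMonth & Month<=endMonth]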
    
    
          ID Month MonValue startMonth endMonth QTRValue
     [1,]  A     1        1          1        3        1
     [2,]  A     2        2          1        3        1
     [3,]  A     3        3          1        3        1
     [4,]  A     4        4          4        5        2
     [5,]  A     5        5          4        5        2
     [6,]  B     1        6          1        2        3
     [7,]  B     2        7          1        2        3
     [8,]  B     3        8          3        5        4
     [9,]  B     4        9          3        5        4
    [10,]  B     5       10          3        5        4
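
More recent data.table releases (1.9.8 and later, which is an assumption about the installed version) support non-equi joins, so the range condition can go directly into `on=` instead of being filtered after a join on ID. The sketch below is not the approach from this answer; it uses an update join that writes QTRValue into the monthly table by reference and rebuilds the example data so that it runs on its own.

```r
library(data.table)

mD <- data.table(ID = rep(c("A", "B"), each = 5),
                 Month = rep(1:5, times = 2),
                 MonValue = 1:10)
qD <- data.table(ID = rep(c("A", "B"), each = 2),
                 startMonth = c(1L, 4L, 1L, 3L),
                 endMonth   = c(3L, 5L, 2L, 5L),
                 QTRValue   = 1:4)

# Update join: for each quarter in qD, find the rows of mD with the same ID
# whose Month lies in [startMonth, endMonth], and write that quarter's
# QTRValue into them by reference. Months without a quarter stay NA.
mD[qD, on = .(ID, Month >= startMonth, Month <= endMonth),
   QTRValue := i.QTRValue]
mD
```

Because the monthly table is updated in place, its row order and row count are unchanged, and no intermediate table of all month-quarter pairs is built.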
    

    Benchmark

    I was curious how these two approaches compare:

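    library(rbenchmark)

    withBase <- function(){
      xx <- within(monthlyData,
          Value <- qtrData$QTRValue[matchData(monthlyData$ID, monthlyData$Month, qtrData)])

    }

    withTable <- function(){
      yy <- mD[qD, allow.cartesian=TRUE][Month>=startMonth & Month<=endMonth]

    }

    # Note: the functions are passed without parentheses, so benchmark() times
    # only the lookup of the function names; write withBase() and withTable()
    # to time the matching itself.
    benchmark(withBase, withTable, replications=1e6)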
    
           test replications elapsed relative user.self sys.self user.child
    1  withBase      1000000   10.09 1.296915      7.65     0.21         NA
    2 withTable      1000000    7.78 1.000000      6.38     0.16         NA
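
In the benchmark call above the function names are passed without parentheses, so each replication only looks up a function object and the reported timings may not reflect the matching work. A minimal sketch of a comparison that actually calls both approaches follows. It uses base system.time() instead of rbenchmark, and the replication count `n` is a hypothetical value chosen to keep the run short. No timings are shown because they depend on the machine and package versions.

```r
library(data.table)

monthlyData <- data.frame(ID = rep(c("A", "B"), each = 5),
                          Month = rep(1:5, times = 2),
                          MonValue = 1:10)
qtrData <- data.frame(ID = rep(c("A", "B"), each = 2),
                      startMonth = c(1, 4, 1, 3),
                      endMonth   = c(3, 5, 2, 5),
                      QTRValue   = 1:4)
mD <- data.table(monthlyData, key = "ID")
qD <- data.table(qtrData, key = "ID")

# Base R: look up the matching quarter row for every month with vapply().
withBase <- function() {
  idx <- vapply(seq_len(nrow(monthlyData)), function(i)
    which(monthlyData$ID[i] == qtrData$ID &
          monthlyData$Month[i] >= qtrData$startMonth &
          monthlyData$Month[i] <= qtrData$endMonth),
    FUN.VALUE = integer(1))
  qtrData$QTRValue[idx]
}

# data.table: join on ID, then filter on the month range.
withTable <- function() {
  mD[qD, allow.cartesian = TRUE][Month >= startMonth & Month <= endMonth]$QTRValue
}

# Both approaches should agree before their timings are compared.
stopifnot(identical(withBase(), withTable()))

# Note the parentheses: the functions are called on every replication.
n <- 200  # hypothetical replication count
timeBase  <- system.time(for (k in seq_len(n)) withBase())[["elapsed"]]
timeTable <- system.time(for (k in seq_len(n)) withTable())[["elapsed"]]
c(base = timeBase, data.table = timeTable)
```

On ten rows the fixed overhead of each call is likely to dominate, so a realistic comparison would need data of the size actually used.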
    