I have a data frame, such as the following: name day wages 1 Ann

Question


Asked: June 2, 2026 (2026-06-02T05:20:46+00:00)


I have a data frame, such as the following:

  name day wages
1  Ann   1   100
2  Ann   1   150
3  Ann   2   200
4  Ann   3   150
5  Bob   1   100
6  Bob   1   200
7  Bob   1   150
8  Bob   2   100

For every unique name/day pair, I would like to calculate a range of totals, such as ‘number of times wages was greater than 175 on current or next day for this person’. There are many more columns than wages and there are four time-slices to be applied to each total for each row.

I can currently accomplish this by de-duplicating my data frame on name and day:

df.unique <- df[!duplicated(df[,c('name','day')]),]

And then for every row in df.unique, applying the following function (written longhand for clarity) to df:

# Count rows for `person` on `day` or the following day whose wages exceed `amount`.
wages_gt_for_person_today_or_next <- function(df, amount, day, person) {
  temp <- df[df$name == person, ]
  temp <- temp[temp$day == day | temp$day == day + 1, ]
  temp <- temp[temp$wages > amount, ]
  nrow(temp)
}

# The function must be defined before this loop runs.
for (i in seq_len(nrow(df.unique))) {
  df.unique[i, "wages_gt_175_day_and_next"] <-
    wages_gt_for_person_today_or_next(df, 175, df.unique[i, "day"], df.unique[i, "name"])
}

Giving me, in this trivial example:

name day wages_gt_175_day_and_next
Ann   1   1
Ann   2   1
Ann   3   0
Bob   1   1
Bob   2   0

However, this seems an extremely slow approach, given that I have hundreds of thousands of rows. Is there a cleverer way of doing this? Something with matrix operations, apply, sqldf, anything like that?

Code to recreate example df:

df <- structure(list(name = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L), .Label = c("Ann", "Bob"), class = "factor"), day = c(1, 
1, 2, 3, 1, 1, 1, 2), wages = c(100, 150, 200, 150, 100, 200, 
150, 100)), .Names = c("name", "day", "wages"), row.names = c(NA, 
-8L), class = "data.frame")


1 Answer

Editorial Team · Answer 1 · 2026-06-02T05:20:49+00:00


Going simply from your example output, here’s something a bit fancier using data.table:

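library(data.table)
DT <- as.data.table(df)
setkey(DT, name, day)  # sorts by name, then day

# Step 1: count wages above 175 per name/day.
# Step 2: within each name, add the next row's count and flag whether the sum is positive.
DT[, list(gt175 = sum(wages > 175)), by = list(name, day)][
  , list(day = day, gt175 = as.integer(gt175 + c(tail(gt175, -1), 0) > 0)), by = list(name)]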

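This is a little convoluted, but should be fast. Note that it yields a 0/1 flag rather than a count, and it treats the next row within each name as the next day, so it assumes no person has gaps between days.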
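
If actual counts are needed, or some people may have missing days, one alternative is to join each name/day count against the count for the following day. The sketch below is a minimal illustration on the example data, not the original answerer's method; the names daily, nxt, res, n_today and n_next are made up for this example.

```r
library(data.table)

# Example data from the question (name kept as character for simplicity).
df <- data.frame(
  name  = c("Ann", "Ann", "Ann", "Ann", "Bob", "Bob", "Bob", "Bob"),
  day   = c(1, 1, 2, 3, 1, 1, 1, 2),
  wages = c(100, 150, 200, 150, 100, 200, 150, 100)
)
DT <- as.data.table(df)

# Per name/day count of wages strictly above the threshold.
daily <- DT[, .(n_today = sum(wages > 175)), by = .(name, day)]

# Copy of the daily counts with day shifted back by one, so the row for
# day d + 1 lines up with day d when merged on name and day.
nxt <- daily[, .(name, day = day - 1, n_next = n_today)]

# Left join: days with no following-day record get NA, which is treated as zero.
res <- merge(daily, nxt, by = c("name", "day"), all.x = TRUE)
res[is.na(n_next), n_next := 0L]
res[, wages_gt_175_day_and_next := n_today + n_next]
res[, c("n_today", "n_next") := NULL]

print(res)
```

On the example data this gives 1, 1, 0, 1, 0 for the five name/day pairs, matching the expected output in the question. Because the grouping and the merge each run once over the whole table, there is no per-row subsetting of df. Other time windows could presumably be handled the same way by merging further shifted copies.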
