So I have a dataframe with four columns: Course ID, User ID, Day (an

Question

Asked: June 14, 2026, 10:55:30 UTC

So I have a data frame with four columns: Course ID, User ID, Day (an integer), and Cumulative Points Received. What I want to do is, for each user-course pair, use lowess to smooth the cumulative points over all of the days of the course. The lowess function takes a vector, applies a smoothing algorithm, and then returns two vectors, x and y. I’m only interested in the y vector.

My first idea was

aggregate(df$CumulativePointsReceived, 
          list(df$UserID, df$CourseID),
          function(x) lowess(x)$y)

But that returns a basically unusable data frame where the third column is a list of those vectors. What I want is a data frame exactly like the input df, but with an extra column holding the smoothed point value for each user-course-day. I’m sure there’s a non-for-loop way to do this, but I can’t seem to think about it the right way. Thanks in advance…

Here’s the dput of the first user-course pair in the original df. I would have put more, but it gets stupidly large with 110 days for each user-course.

structure(list(CourseID = c(6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L, 6567146L,
6567146L), UserID = c(4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L, 4759679L,
4759679L), DayInCourse = 1:110, CumulativePointsReceived = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 47, 47, 47, 47, 47, 47, 47, 47,
47, 47, 47, 47, 47, 107, 107, 107, 107, 107, 107, 107, 107, 107,
107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107,
107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107,
107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107,
107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107,
107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107,
107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107, 107)), .Names = c("CourseID",
"UserID", "DayInCourse", "CumulativePointsReceived"), row.names = c(46085L,
46118L, 46120L, 46133L, 46102L, 46086L, 46182L, 46184L, 46159L,
46139L, 46088L, 46090L, 46144L, 46161L, 46187L, 46113L, 46177L,
46193L, 46151L, 46143L, 46126L, 46121L, 46104L, 46170L, 46128L,
46131L, 46167L, 46098L, 46127L, 46178L, 46101L, 46129L, 46152L,
46175L, 46093L, 46122L, 46096L, 46136L, 46106L, 46116L, 46148L,
46173L, 46189L, 46117L, 46172L, 46162L, 46164L, 46108L, 46091L,
46112L, 46135L, 46181L, 46190L, 46171L, 46169L, 46100L, 46141L,
46103L, 46168L, 46110L, 46107L, 46089L, 46154L, 46165L, 46125L,
46163L, 46147L, 46166L, 46183L, 46160L, 46150L, 46097L, 46115L,
46157L, 46194L, 46138L, 46188L, 46153L, 46155L, 46179L, 46180L,
46191L, 46095L, 46176L, 46111L, 46105L, 46142L, 46087L, 46109L,
46158L, 46145L, 46114L, 46192L, 46140L, 46146L, 46174L, 46094L,
46124L, 46149L, 46119L, 46186L, 46130L, 46134L, 46156L, 46185L,
46099L, 46123L, 46137L, 46132L, 46092L), class = "data.frame")

1 Answer

Editorial Team · Answer 1 · 2026-06-14T10:55:31+00:00

You can do this with base R functions: split the data frame by user-course pair and apply lowess() to each piece. For example:

lapply(split(df, list(df$UserID, df$CourseID)),
       function(x) with(x, lowess(DayInCourse, CumulativePointsReceived))$y)

which returns:

$`4759679.6567146`
  [1]  40.92152  42.50447  44.08898  45.67481  47.26167  48.84919
  [7]  50.43697  52.02450  53.61120  55.19639  56.77928  58.35896
 [13]  59.93435  61.50424  63.06724  64.62175  66.16596  67.69780
 [19]  69.21547  70.71909  72.20948  73.68773  75.15522  76.61367
 [25]  78.06516  79.51217  80.95767  82.40508  83.85843  85.32230
 [31]  86.80193  88.30315  89.83235  91.39619  93.00115  94.65248
 [37]  96.35240  98.75650 100.73124 102.31467 103.55841 104.51780
 [43] 105.24556 105.78855 106.18658 106.47246 106.67275 106.80862
 [49] 106.89685 106.95067 106.98051 106.99458 106.99936 107.00000
 [55] 107.00000 107.00000 107.00000 107.00000 107.00000 107.00000
 [61] 107.00000 107.00000 107.00000 107.00000 107.00000 107.00000
 [67] 107.00000 107.00000 107.00000 107.00000 107.00000 107.00000
 [73] 107.00000 107.00000 107.00000 107.00000 107.00000 107.00000
 [79] 107.00000 107.00000 107.00000 107.00000 107.00000 107.00000
 [85] 107.00000 107.00000 107.00000 107.00000 107.00000 107.00000
 [91] 107.00000 107.00000 107.00000 107.00000 107.00000 107.00000
 [97] 107.00000 107.00000 107.00000 107.00000 107.00000 107.00000
[103] 107.00000 107.00000 107.00000 107.00000 107.00000 107.00000
[109] 107.00000 107.00000

To keep the original columns alongside the smoothed values, wrap the lowess() call in transform():

out <- lapply(split(df, list(df$UserID, df$CourseID)),
              function(x) transform(x, smooth = lowess(DayInCourse,
                                                       CumulativePointsReceived)$y))

> head(out[[1]])
      CourseID  UserID DayInCourse CumulativePointsReceived   smooth
46085  6567146 4759679           1                        0 40.92152
46118  6567146 4759679           2                        0 42.50447
46120  6567146 4759679           3                        0 44.08898
46133  6567146 4759679           4                        0 45.67481
46102  6567146 4759679           5                        0 47.26167
46086  6567146 4759679           6                        0 48.84919

As you only supplied one course/user combo, the result is a list with just one component. With real data the list will have one component per user-course pair. To stack them back into a single data frame, do

final <- do.call(rbind, out)
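
With several user-course pairs, two details matter that the single-pair example hides. split() on a list of factors creates every UserID/CourseID combination, including ones with no rows, and lowess() returns its fitted values ordered by x rather than in the row order of the input. The following is a minimal sketch of one way to handle both. The toy df, its values, and the helper name smooth_group are made up for illustration and are not from the question.

```r
# Toy data (hypothetical): two user-course pairs, 20 days each,
# with the rows shuffled so they are not sorted by day.
set.seed(1)
df <- data.frame(
  CourseID    = rep(c(1L, 2L), each = 20),
  UserID      = rep(c(10L, 20L), each = 20),
  DayInCourse = rep(1:20, times = 2),
  CumulativePointsReceived = c(cumsum(rep(c(0, 5), 10)),
                               cumsum(rep(c(0, 3), 10)))
)
df <- df[sample(nrow(df)), ]

# Hypothetical helper: smooth one user-course group.
smooth_group <- function(g) {
  o <- order(g$DayInCourse)
  g$smooth <- NA_real_
  # lowess() returns y ordered by x, so write it back in that order
  g$smooth[o] <- lowess(g$DayInCourse, g$CumulativePointsReceived)$y
  g
}

# drop = TRUE skips user-course combinations that have no rows
pieces <- split(df, list(df$UserID, df$CourseID), drop = TRUE)
final  <- do.call(rbind, lapply(pieces, smooth_group))
```

Here final has the same rows as df plus the smooth column. Its row names typically carry the group label as a prefix; rownames(final) <- NULL resets them if that matters.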

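The reason the aggregate() attempt gives an awkward result is that aggregate() is meant to reduce each group to a summary. When the function returns a whole vector per group, those vectors get packed into a single matrix or list column instead of being lined up with the original rows. Note also that lowess(x) called on a single vector uses the observation index as x; passing DayInCourse and CumulativePointsReceived explicitly, as above, is clearer. I don’t think aggregate() is the right approach here. Doing the split-apply-combine by hand would be the way to go unless you want to learn plyr.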
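
If all that is needed is a new column on the original data frame, base R's ave() is another option: it applies a function within groups and returns a vector in the original row order. The following is a sketch only, using the question's column names on a made-up toy df. Grouping over row indices is a trick chosen here so the function can see both DayInCourse and the points.

```r
# Toy data (hypothetical): two user-course pairs, rows shuffled.
set.seed(2)
df <- data.frame(
  CourseID    = rep(c(1L, 2L), each = 15),
  UserID      = rep(c(10L, 20L), each = 15),
  DayInCourse = rep(1:15, times = 2),
  CumulativePointsReceived = c(cumsum(rep(c(0, 0, 6), 5)),
                               cumsum(rep(c(0, 4, 0), 5)))
)
df <- df[sample(nrow(df)), ]

# One grouping factor per user-course pair; drop = TRUE omits empty pairs
grp <- interaction(df$UserID, df$CourseID, drop = TRUE)

# ave() hands each group's row indices to FUN and puts the results
# back in the original row positions.
df$smooth <- ave(as.numeric(seq_len(nrow(df))), grp, FUN = function(i) {
  o <- order(df$DayInCourse[i])
  out <- numeric(length(i))
  # lowess() returns y ordered by x; restore the group's row order
  out[o] <- lowess(df$DayInCourse[i], df$CumulativePointsReceived[i])$y
  out
})
```

This leaves df in its original row order, with no list of pieces to recombine.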
