I have dataframe that contains 70-80 rows of ordered response time (rt) data for

Question

Asked: May 26, 2026 (2026-05-26T01:11:08+00:00)

I have a dataframe that contains 70-80 rows of ordered response time (RT) data for each of 228 people, each with a unique id number (not everyone has the same number of rows). I want to bin each person's RTs into 5 bins: the 1st bin should hold their fastest 20 percent of RTs, the 2nd bin their next fastest 20 percent, and so on. Each bin should contain the same number of trials (unless the person's total number of trials does not divide evenly by 5).

My current dataframe looks like this:

I’d like my new dataframe to look like this:

id   RT    Bin
7000  225    1
7000  250    1

After getting my data to look like this, I will aggregate by id and bin.

The only way I can think of to do this is to split the data into a list (using the split function), loop through each person, use the quantile function to get break points for the bins, and then assign a bin value (1-5) to every response time. This feels very convoluted (and would be difficult for me). I'm in a bit of a jam and would greatly appreciate any help in streamlining this process. Thanks.

1 Answer

Editorial Team · Answer 1 · 2026-05-26T01:11:09+00:00

The answer @Chase gave splits the range into 5 groups of equal width (difference of endpoints). What you seem to want is quintiles (5 groups with an equal number of observations in each group). For that, you can use the cut2 function in Hmisc:

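library("plyr")   # provides ddply()
library("Hmisc")  # provides cut2()

# Example data: 10 ids with 10 random values each
dat <- data.frame(id = rep(1:10, each = 10), value = rnorm(100))

# Within each id, cut value into g = 5 quantile groups and keep the group number (1-5)
tmp <- ddply(dat, "id", transform, hists = as.numeric(cut2(value, g = 5)))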

tmp now has what you want, with the bin number in the hists column:

> tmp
    id       value hists
1    1  0.19016791     3
2    1  0.27795226     4
3    1  0.74350982     5
4    1  0.43459571     4
5    1 -2.72263322     1
....
95  10 -0.10111905     3
96  10 -0.28251991     2
97  10 -0.19308950     2
98  10  0.32827137     4
99  10 -0.01993215     4
100 10 -1.04100991     1

Each id has the same number of rows in each hists group:

> table(tmp$id, tmp$hists)

     1 2 3 4 5
  1  2 2 2 2 2
  2  2 2 2 2 2
  3  2 2 2 2 2
  4  2 2 2 2 2
  5  2 2 2 2 2
  6  2 2 2 2 2
  7  2 2 2 2 2
  8  2 2 2 2 2
  9  2 2 2 2 2
  10 2 2 2 2 2
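
If you would rather not load extra packages, here is a minimal base R sketch of the same idea, followed by the aggregation step mentioned in the question. It is only one possible approach, not the method above: it ranks each person's RTs and maps the ranks onto bins 1 to 5, so bin sizes differ by at most one trial even when there are tied RTs (interval-based cuts generally keep tied values in the same bin, which can make bins uneven). The ids, trial counts, RT values and the helper name rt_bin are made up for illustration.

```r
# Made-up example data (assumption): 3 people with unequal numbers of trials
set.seed(1)
n_trials <- c(70, 75, 80)
dat <- data.frame(
  id = rep(c(7000, 7001, 7002), times = n_trials),
  RT = round(runif(sum(n_trials), min = 200, max = 900))
)

# Hypothetical helper: rank the RTs (1 = fastest) and map ranks onto bins 1..n_bins.
# ties.method = "first" gives every trial a distinct rank, so tied RTs
# are spread over neighbouring bins instead of piling into one bin.
rt_bin <- function(x, n_bins = 5) {
  ceiling(rank(x, ties.method = "first") * n_bins / length(x))
}

# Apply the helper separately within each id
dat$Bin <- ave(dat$RT, dat$id, FUN = rt_bin)

# Number of trials per bin for each person
table(dat$id, dat$Bin)

# Mean RT for each id and bin
agg <- aggregate(RT ~ id + Bin, data = dat, FUN = mean)
```

Here Bin 1 holds each person's fastest (smallest) RTs and Bin 5 the slowest. When a person's trial count is not a multiple of 5, some of their bins get one more trial than others.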
