The Archive Base

Editorial Team
Asked: June 5, 2026 (2026-06-05T12:35:47+00:00)

This question is similar to a previous one I made here: randomly sum values from rows and assign them to 2 columns in R

Since I’m having difficulties with R, this question is both about programming and statistics. I’m very new to both.

I have a data.frame with 219 subjects, one per row, identified in the first column. The remaining 7 columns each hold a number: the difference in response time for that subject between the two conditions of the experiment, one column per block.

This is how the data looks (I’m using the head function, otherwise it would be too long):

    > head(RTsdiff)
      subject   block3diff   block4diff   block5diff   block6diff   block7diff
    1   40002  0.076961798  0.046067460 -0.027012048  0.017920261  0.002660317
    2   40004  0.037558511 -0.016535211 -0.044306743 -0.011541667  0.044422892
    3   40006 -0.017063123 -0.031156150 -0.084003876 -0.070227149 -0.113382784
    4   40008 -0.015204017 -0.009954545 -0.004082353  0.006327839  0.022335271
    5   40009  0.006055829 -0.045376437 -0.002725572  0.016443182  0.032848128
    6   40010 -0.003017857 -0.034398268 -0.034476491  0.014158824 -0.036592982
       block8diff   block9diff
    1  0.03652273  0.037306173
    2 -0.08032784 -0.150682051
    3 -0.09724864 -0.060338684
    4 -0.04783333  0.006539326 
    5 -0.01459465 -0.067916667
    6 -0.01868126 -0.034409584

What I need is code that, for every subject (i.e. every row), will randomly sample either 3 or 4 of the values, average them, and add the result to a new vector (called half1). The vector half2 should hold the average of the values that were not sampled for half1.

So, supposing the data.frame I want to create is called “RTshalves”, the first column should be the same column of subjects as in RTsdiff. In the first row, the second column must hold the average of the randomly selected values for the first subject, and the third column must hold the average of that subject's values that were not chosen in the sampling. The second row of columns 2 and 3 should have the same information, but for subject 2 (that is, subject 40004 in my data.frame), and so on for all 219 subjects.

Let’s suppose that the first sample randomly selected 3 values of subject 1 (block3diff, block5diff and block9diff) and thus the values of block4diff, block6diff, block7diff and block8diff would automatically correspond to the other half. Then, what I would expect to see (considering only the first of the 219 rows) is:

   Subject     Half1       Half2 
    40002   0.02908531   0.02579269
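
As a check on the arithmetic, the expected row can be reproduced directly from the first row of the head() output above. This is only a sketch of the hand calculation: the names s1 and picked are made up for illustration, and the "sample" is fixed to the three blocks named in the example rather than drawn at random.

```r
# Subject 40002, values copied from the head(RTsdiff) output above
s1 <- c(block3diff =  0.076961798, block4diff = 0.046067460,
        block5diff = -0.027012048, block6diff = 0.017920261,
        block7diff =  0.002660317, block8diff = 0.03652273,
        block9diff =  0.037306173)

# The hypothetical first sample from the example (fixed here, not random)
picked <- c("block3diff", "block5diff", "block9diff")

half1 <- mean(s1[picked])                      # mean of the 3 sampled blocks
half2 <- mean(s1[setdiff(names(s1), picked)])  # mean of the remaining 4 blocks

round(c(Half1 = half1, Half2 = half2), 8)
```

Rounded to 8 decimals, the two means agree with the Half1 and Half2 values in the row shown above.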

If anyone is interested in the statistics behind this, I’m trying to do a split-half reliability test to check for the consistency of a test. The rationale is that if the difference in RT average is a reliable estimator of the effect, then the differences of half of the blocks of one participant should be correlated to the differences of the other half of the blocks.
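
In code, the check described here comes down to correlating the two columns of halves once they exist. A minimal sketch, assuming a data.frame named RTshalves with columns half1 and half2; the values below are simulated purely for illustration (the effect sizes and noise levels are arbitrary assumptions, not the real data):

```r
set.seed(1)  # arbitrary seed so the toy example is repeatable

# Simulated stand-in for RTshalves: 219 subjects with a shared per-subject
# effect plus independent noise in each half (NOT the real data)
effect <- rnorm(219, sd = 0.03)
RTshalves <- data.frame(
  subject = seq_len(219),
  half1   = effect + rnorm(219, sd = 0.02),
  half2   = effect + rnorm(219, sd = 0.02)
)

# Split-half check: correlation between the two half-averages
r <- cor(RTshalves$half1, RTshalves$half2)
r
```

Because the split is random, a different split gives a somewhat different correlation, so the result of a single split is only one draw.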

Help is much appreciated.
Thanks in advance.

1 Answer

  1. Editorial Team
    Added an answer on June 5, 2026 at 12:35 pm (2026-06-05T12:35:51+00:00)

    half1 is easy: write your own function to do what you want to each row (taken in as a vector), then apply it to the rows:

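    eachrow <- function(x) {
       mean(sample(x,3)) # average 3 randomly chosen values from the row
    }
    # drop the subject column, then apply the function to each row (MARGIN = 1)
    RTsdiff$half1 <- apply(RTsdiff[,-1],1,eachrow)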
    

    To get half2, you’ll probably want to compute it at the same time as half1, so that both come from the same random draw. ddply might be easiest for this (use your subject variable as the grouping variable, so that each group is a single row). Like this:

    RTsdiff <- data.frame(subject=seq(6))
    RTsdiff <- cbind( RTsdiff, matrix(runif(6*8),ncol=8) )
    
    library(plyr)
    eachrow <- function(x,n=3) {
      x <- as.numeric(x[,2:ncol(x)]) # drop the ID column and turn the one-row data.frame into a vector
      s <- seq(length(x))
      ones <- sample(s,n) # positions of the values that go into half1
      twos <- !(s %in% ones) # logical mask selecting the remaining values (half2)
      data.frame( half1=mean(x[ones]), half2=mean(x[twos]) )
    }
    ddply( RTsdiff, .(subject), eachrow)
    
      subject     half1     half2
    1       1 0.4700982 0.5350610
    2       2 0.6173469 0.5351995
    3       3 0.2245246 0.6807482
    4       4 0.6330649 0.6316353
    5       5 0.6388060 0.6629077
    6       6 0.4652086 0.5073034
    

    There are plenty of more elegant ways of doing this. In particular, I used ddply because each call can return a data.frame, so the function can output both half1 and half2 and have the results combined at the end. The catch is that ddply also passes each group in as a data.frame, hence the conversion to a vector at the top of the function. Feeding sapply a transposed data.frame would possibly be simpler.
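
One possible base-R variant, offered only as a sketch: it uses apply() over the rows of a matrix rather than ddply or the sapply-on-a-transpose route, and it draws the size of the first half as 3 or 4 at random, as the question asks. The name split_row and the toy data are assumptions for illustration, not part of the original post.

```r
set.seed(42)  # arbitrary; only so the toy example is repeatable

# Toy stand-in for RTsdiff: 6 subjects, 7 block-difference columns
RTsdiff <- data.frame(subject = 40001:40006,
                      matrix(rnorm(6 * 7, sd = 0.05), ncol = 7))

split_row <- function(x) {
  n    <- sample(3:4, 1)            # size of the first half: 3 or 4
  ones <- sample(seq_along(x), n)   # positions drawn for half1
  c(half1 = mean(x[ones]),          # average of the sampled values
    half2 = mean(x[-ones]))         # average of the values left over
}

# apply() over rows returns a 2 x nrow matrix, so transpose before binding
halves    <- t(apply(as.matrix(RTsdiff[, -1]), 1, split_row))
RTshalves <- data.frame(subject = RTsdiff$subject, halves)
RTshalves
```

Returning a named vector from the row function lets apply() carry both halves out of a single random draw, which avoids sampling twice for the same subject.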

© 2021 The Archive Base. All Rights Reserved