The Archive Base

Editorial Team
Asked: June 18, 2026

Related to a previous question I asked (ggplot2 how to get 2 histograms with the y value = to count of one / sum of the count of both), I tried to write a function that takes a data.frame as input, containing the response times (RT) and accuracy (correct) of several participants in several conditions, and outputs a “summary” data.frame with the data aggregated as in a histogram. The particularity here is that I don’t want the absolute number of responses in each bin, but the relative count.

What I call relative count is that, for each bin of the histogram, the value corresponds to:

relative_correct   = ncorrect / sum(ncorrect+nincorrect)
relative_incorrect = nincorrect / sum(ncorrect+nincorrect)

The result is close to a density plot, except that each curve does not sum to 1 on its own; instead, the correct and incorrect curves together sum to 1.
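As a small worked illustration of the definition above, here is a sketch using made-up per-bin counts (the numbers are assumptions chosen only to make the arithmetic easy to follow):

```r
# Hypothetical per-bin counts for one participant in one condition
ncorrect   <- c(2, 5, 8, 3)
nincorrect <- c(1, 3, 2, 1)

# Total number of responses across both accuracy levels and all bins
n_total <- sum(ncorrect + nincorrect)

relative_correct   <- ncorrect / n_total
relative_incorrect <- nincorrect / n_total

# The two curves together sum to 1; each curve alone does not
total_both    <- sum(relative_correct) + sum(relative_incorrect)
total_correct <- sum(relative_correct)
print(total_both)
print(total_correct)
```

With these counts the denominator is 25 responses, so the correct curve alone sums to 18/25 and the incorrect curve to 7/25.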

Here is the code to create sample data:

# CREATE EXAMPLE DATA
subjectname <- factor(rep(c("obs1","obs2"),each=50))
Visibility  <- factor(rep(rep(c("cond1","cond2"),each=25),2)) 
RT          <- rnorm(100,300,50)
correct     <- sample(c(rep(0,25),rep(1,75)),100)
my.data <- data.frame(subjectname,Visibility,RT,correct)

First, I need to define a function to be used later in a ddply call:

histRTcounts <- function(df) {
  # bin the RTs without plotting and return only the per-bin counts
  out <- hist(df$RT, breaks = seq(5, 800, by = 10), plot = FALSE)
  out$counts
}
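To make it concrete what such a helper returns, here is a minimal base R sketch on a handful of made-up response times (the `rt` values are assumptions for illustration). It uses the same breaks as above and inspects the count vector that `hist()` produces when `plot = FALSE`:

```r
# Made-up response times in ms (assumption, for illustration only)
rt <- c(120, 125, 300, 310, 318)

breaks <- seq(5, 800, by = 10)
h <- hist(rt, breaks = breaks, plot = FALSE)

# One count per bin, so the vector has length(breaks) - 1 entries
n_bins <- length(h$counts)

# Every observation falls into exactly one bin
n_counted <- sum(h$counts)

# Midpoints of the bins that actually contain data
h$mids[h$counts > 0]
```

This count vector is what ddply later spreads into one column per bin, which is why the main function has to melt the result back into long format.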

And then the main function (there are two small issues which prevent it from working inside a function, see the lines marked with ?????, but outside of a function this code works):

relative_hist_count <- function(df, myfactors) {
  require(ggplot2)
  require(plyr)
  require(reshape2)

  # ddply it to get one column for each bin of the histogram
  myhistRTcounts <- ddply(df, c(myfactors,"correct"), histRTcounts)

  # transform it in long format
  myhistRTcounts.long = melt(myhistRTcounts, id.vars =c(myfactors,"correct"), variable.name="bin", value.name = 'mycount')

  # rename the bin names with the ms value they correspond to
  levels(myhistRTcounts.long$bin) <- seq(5, 800, by=10)[-1]-5

  # make them numeric and not a factor anymore
  myhistRTcounts.long$bin = as.numeric(levels(myhistRTcounts.long$bin))[myhistRTcounts.long$bin]

  # cast to have count_correct and count_incorrect as columns
  # ??????????????????????? problem when putting that into a function
  # Here I was not able to figure out how to combine myfactors to the other variables in the call
  myhistRTcount.short = dcast(myhistRTcounts.long, subjectname + Visibility + bin ~ correct)
  names(myhistRTcount.short)[4:5] <- c("countinc","countcor")

  # compute relative counts
  myhistRTcounts.rel <- ddply(myhistRTcount.short, myfactors, transform, 
                          incorrect = countinc / sum(countinc+countcor),
                          correct = countcor / sum(countinc+countcor)
  )
  myhistRTcounts.rel = subset(myhistRTcounts.rel,select=c(-countinc,-countcor))

  myhistRTcounts.rel.long = melt(myhistRTcounts.rel, id.vars = c(myfactors,"bin"), variable.name = 'correct', value.name = 'mycount')

  # ??????????????????????? idem here, problem when putting that into a function to call myfactors
  ggplot(data=myhistRTcounts.rel.long, aes(x=bin, y=mycount, color=factor(correct))) + geom_line() + facet_grid(Visibility ~ subjectname) + xlim(0, 600) + theme_bw()

  return(myhistRTcounts.rel.long)
}

The call to apply it to the data:

new.df = relative_hist_count(my.data, myfactors = c("subjectname","Visibility"))

So first, I would need your help to make this work as a function, with the possibility of using the myfactors variable in dcast() and ggplot().

But more importantly, I’m almost sure this function could be written much more elegantly and in a more straightforward manner, with fewer steps.

Thank you in advance for your help!

1 Answer

  1. Editorial Team
    Added an answer on June 18, 2026 at 9:17 am

    Thanks Roland, I did not think about writing a homemade hist function. Please find it below:

    RelativeHistRT <- function (df, breaks = seq(5,800,10)) 
    {
      distrib.correct   = hist(df$RT[df$correct==1], breaks, right=FALSE, plot=FALSE)
      distrib.incorrect = hist(df$RT[df$correct==0], breaks, right=FALSE, plot=FALSE)
    
      n.total = sum(distrib.correct$counts) + sum(distrib.incorrect$counts)
    
      data.frame(bin_mids  = distrib.correct$mids,
                 correct   = distrib.correct$counts / n.total,
                 incorrect = distrib.incorrect$counts / n.total)
    }
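One way to sanity-check a function like this is to confirm that, within each group, the correct and incorrect columns together sum to 1. The sketch below copies the function so that it runs on its own, and uses deterministic made-up data and base R `split()`/`sapply()` as a stand-in for ddply; the data and the grouping approach are assumptions for illustration, not part of the original answer.

```r
# Copy of RelativeHistRT from the answer, so this block is self-contained
RelativeHistRT <- function(df, breaks = seq(5, 800, 10)) {
  distrib.correct   <- hist(df$RT[df$correct == 1], breaks, right = FALSE, plot = FALSE)
  distrib.incorrect <- hist(df$RT[df$correct == 0], breaks, right = FALSE, plot = FALSE)
  n.total <- sum(distrib.correct$counts) + sum(distrib.incorrect$counts)
  data.frame(bin_mids  = distrib.correct$mids,
             correct   = distrib.correct$counts / n.total,
             incorrect = distrib.incorrect$counts / n.total)
}

# Deterministic made-up data: 2 participants x 2 conditions, 25 trials each
check.data <- data.frame(
  subjectname = factor(rep(c("obs1", "obs2"), each = 50)),
  Visibility  = factor(rep(rep(c("cond1", "cond2"), each = 25), 2)),
  RT          = seq(200, 400, length.out = 100),
  correct     = rep(c(0, 1, 1, 1), 25)
)

# Split by the grouping factors, apply the function, and sum both curves
groups <- split(check.data,
                list(check.data$subjectname, check.data$Visibility),
                drop = TRUE)
totals <- sapply(groups, function(d) {
  res <- RelativeHistRT(d)
  sum(res$correct) + sum(res$incorrect)
})
print(totals)
```

Each entry of `totals` should equal 1 up to floating-point error, since every trial in a group lands in exactly one bin of exactly one of the two histograms.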
    

    And to apply it to my original data.frame and get what I was looking for:

    myhistRTcounts <- ddply(my.data, .(subjectname,Visibility), RelativeHistRT)
    

    This is indeed much shorter and does exactly what I was looking for.
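The first part of the question (using myfactors inside dcast() and ggplot()) is not covered by the answer above. One possible approach, offered here as a sketch rather than a tested fix to the original function, is to build the formulas from the character vector with base R and pass the resulting formula objects on. The names `cast_formula` and `facet_formula` are hypothetical.

```r
myfactors <- c("subjectname", "Visibility")

# Left-hand side "subjectname + Visibility + bin", right-hand side "correct"
lhs <- paste(c(myfactors, "bin"), collapse = " + ")
cast_formula <- as.formula(paste(lhs, "~ correct"))

# Rows ~ columns for the facets, assuming exactly two grouping factors
facet_formula <- as.formula(paste(myfactors[2], "~", myfactors[1]))

print(cast_formula)
print(facet_formula)

# Assumed usage inside the function (requires reshape2 and ggplot2):
#   dcast(myhistRTcounts.long, cast_formula)
#   ... + facet_grid(facet_formula)
```

Note also that a ggplot object created inside a function is not displayed automatically, because auto-printing only happens at the top level; wrapping the plot in print(), or returning it, is the usual way around this.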
