Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 752241
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 14, 20262026-05-14T14:45:31+00:00 2026-05-14T14:45:31+00:00

I’m going to use the sample code from http://gettinggeneticsdone.blogspot.com/2009/11/split-apply-and-combine-in-r-using-plyr.html for this example. So, first,

  • 0

I’m going to use the sample code from http://gettinggeneticsdone.blogspot.com/2009/11/split-apply-and-combine-in-r-using-plyr.html for this example. So, first, let’s copy their example data:

mydata=data.frame(X1=rnorm(30), X2=rnorm(30,5,2),
SNP1=c(rep("AA",10), rep("Aa",10), rep("aa",10)),
SNP2=c(rep("BB",10), rep("Bb",10), rep("bb",10)))

I am going to ignore SNP2 in this example and just pretend the values in SNP1 denote group membership. So then, I may want some summary statistics about each group in SNP1: “AA”, “Aa”, “aa”.

Then if I want to calculate the means for each variable, it makes sense (modifying their code slightly) to use:

> ddply(mydata, c("SNP1"), function(df)
data.frame(meanX1=mean(df$X1), meanX2=mean(df$X2)))
  SNP1      meanX1   meanX2
1   aa  0.05178028 4.812302
2   Aa  0.30586206 4.820739
3   AA -0.26862500 4.856006

But what if I want the sample covariance matrix for each group? Ideally, I would like a 3D array, where the I have the covariance matrix for each group, and the third dimension denotes the corresponding group. I tried a modified version of the previous code and got the following results that have convinced me that I’m doing something wrong.

> daply(mydata, c("SNP1"), function(df) cov(cbind(df$X1, df$X2)))
, ,  = 1


SNP1         1          2
  aa 1.4961210 -0.9496134
  Aa 0.8833190 -0.1640711
  AA 0.9942357 -0.9955837

, ,  = 2


SNP1          1        2
  aa -0.9496134 2.881515
  Aa -0.1640711 2.466105
  AA -0.9955837 4.938320

I was thinking that the dim() of the 3rd dimension would be 3, but instead, it is 2. Really this is a sliced up version of the covariance matrix for each group. If we manually compute the sample covariance matrix for aa, we get:

           [,1]       [,2]
[1,]  1.4961210 -0.9496134
[2,] -0.9496134  2.8815146

Using plyr, the following gives me what I want in list() form:

> dlply(mydata, c("SNP1"), function(df) cov(cbind(df$X1, df$X2)))
$aa
           [,1]       [,2]
[1,]  1.4961210 -0.9496134
[2,] -0.9496134  2.8815146

$Aa
           [,1]       [,2]
[1,]  0.8833190 -0.1640711
[2,] -0.1640711  2.4661046

$AA
           [,1]       [,2]
[1,]  0.9942357 -0.9955837
[2,] -0.9955837  4.9383196

attr(,"split_type")
[1] "data.frame"
attr(,"split_labels")
  SNP1
1   aa
2   Aa
3   AA

But like I said earlier, I would really like this in a 3D array. Any thoughts on where I went wrong with daply() or suggestions? Of course, I could typecast the list from dlply() to a 3D array, but I’d rather not do this because I will be repeating this process many times in a simulation.

As a side note, I found one method (http://www.mail-archive.com/r-help@r-project.org/msg86328.html) that provides the sample covariance matrix for each group, but the outputted object is bloated.

Thanks in advance.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-14T14:45:32+00:00Added an answer on May 14, 2026 at 2:45 pm

    daply makes the splitting variable the first dimension in the array.

    a <- daply(mydata, c("SNP1"), function(df) cov(cbind(df$X1, df$X2)))
    l <- dlply(mydata, c("SNP1"), function(df) cov(cbind(df$X1, df$X2)))
    

    This is so that a[1, , ] and l[[1]] correspond to the same output. As wkmor1 suggests, you can use aperm to rearrange the dimensions, but I’d like to know more about why the initial form doesn’t suit your needs.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Ask A Question

Stats

  • Questions 360k
  • Answers 360k
  • Best Answers 0
  • User 1
  • Popular
  • Answers
  • Editorial Team

    How to approach applying for a job at a company ...

    • 7 Answers
  • Editorial Team

    What is a programmer’s life like?

    • 5 Answers
  • Editorial Team

    How to handle personal stress caused by utterly incompetent and ...

    • 5 Answers
  • Editorial Team
    Editorial Team added an answer You use DateTime.ParseExact or DateTime.TryParseExact. You pass through the exact… May 14, 2026 at 2:45 pm
  • Editorial Team
    Editorial Team added an answer Javascript runs before images have loaded by default. The window… May 14, 2026 at 2:45 pm
  • Editorial Team
    Editorial Team added an answer No, you don't need to include them all. However, the… May 14, 2026 at 2:45 pm

Trending Tags

analytics british company computer developers django employee employer english facebook french google interview javascript language life php programmer programs salary

Top Members

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.