The Archive Base

Editorial Team
Asked: June 15, 2026 (2026-06-15T11:58:50+00:00)

Split a data frame on two columns into a 2-D matrix of data frames

I want to split a data frame based on two columns, but I want the output to be a 2-D matrix of data frames rather than a flat list of data frames. I can achieve this using by() and subset(), but I was told (I think by Ripley) that one should avoid using subset() in package development. Is there an elegant alternative (perhaps using split()) that preserves the dimnames?

# sample data
df <- data.frame(x=rnorm(20), y=rnorm(20), v1=rep(letters[1:5],each=4), v2=rep(LETTERS[6:9]))

# what I did previously
submat <- by(df, list(df$v1,df$v2), subset)
dim(submat) # 5 x 4
dimnames(submat) # "a" "b" "c" "d" "e" ; "F" "G" "H" "I"

1 Answer

  1. Editorial Team
    Added an answer on June 15, 2026 at 11:58 am (2026-06-15T11:58:51+00:00)

    To get what you ask for, a matrix of data frames, use tapply() on the row indices with a function that returns the corresponding subset of the data frame. The result is a matrix whose dimnames match the factor levels.

    > dfmat <- with(df, tapply(1:NROW(df), list(v1,v2), function(idx) df[idx,] ) )
    > dfmat[1,1]  # items that are in a single dataframe accessed via matrix indexing
    [[1]]
               x         y v1 v2
    1 -0.5604756 -1.067824  a  F
    
    > dfmat
      F      G      H      I     
    a List,4 List,4 List,4 List,4
    b List,4 List,4 List,4 List,4
    c List,4 List,4 List,4 List,4
    d List,4 List,4 List,4 List,4
    e List,4 List,4 List,4 List,4
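
For anyone who wants to run this as a script rather than at the console, here is a minimal sketch of the same idea. The seed is an assumption (the post does not state one; 123 appears to reproduce the values printed above), and the names cell_list and cell_df are only illustrative. It uses seq_len(nrow(df)) and drop = FALSE, which are slightly more defensive than the original call but should behave the same on this data.

```r
# Assumption: the original post gives no seed; 123 is used here only
# so that the example is reproducible.
set.seed(123)
df <- data.frame(x = rnorm(20), y = rnorm(20),
                 v1 = rep(letters[1:5], each = 4),
                 v2 = rep(LETTERS[6:9], times = 5))

# Group the row indices by both columns; each cell of the result
# holds the rows of df that fall into that (v1, v2) combination.
dfmat <- tapply(seq_len(nrow(df)), list(df$v1, df$v2),
                function(idx) df[idx, , drop = FALSE])

dim(dfmat)        # 5 4
dimnames(dfmat)   # levels of v1, then levels of v2

# Single brackets give a length-one list; double brackets give the
# data frame itself. Both accept the dimnames as indices.
cell_list <- dfmat["a", "F"]
cell_df   <- dfmat[["a", "F"]]
class(cell_list)  # "list"
class(cell_df)    # "data.frame"
```

Note that no call to subset() is involved, which was the concern raised in the question.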
    

    Matrices with lists as entries are printed showing only the object type and the length of each entry (here, the number of columns). Notice that single-bracket indexing such as dfmat[1,1] returns a list with one item, so the data frame's attributes are maintained, but you need to “drill down” with [[ to get the treasure:
    Edit: added the attributes of dfmat:

    >  attributes(dfmat)
    $dim
    [1] 5 4
    
    $dimnames
    $dimnames[[1]]
    [1] "a" "b" "c" "d" "e"
    
    $dimnames[[2]]
    [1] "F" "G" "H" "I"    
    #------------
    > attributes( dfmat[1,1])
    NULL
    #------------
    > attributes( dfmat[1,1][[1]])
    $names
    [1] "x"  "y"  "v1" "v2"
    
    $row.names
    [1] 1
    
    $class
    [1] "data.frame"
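
Since the question also asks about split(), here is a rough sketch of two other routes that avoid subset(). Neither comes from the answer above. The seed and the names f1, f2, parts and bymat are assumptions made for illustration. The first route relies on split() ordering its flat list with the first factor varying fastest, which is the column-major order that a matrix uses.

```r
# Assumption: seed chosen only for reproducibility.
set.seed(123)
df <- data.frame(x = rnorm(20), y = rnorm(20),
                 v1 = rep(letters[1:5], each = 4),
                 v2 = rep(LETTERS[6:9], times = 5))

f1 <- factor(df$v1)
f2 <- factor(df$v2)

# Route 1: split() gives a flat list named "a.F", "b.F", ...
parts <- split(df, list(f1, f2))

# Assigning dim removes the flat names, so the dimnames are set
# explicitly from the factor levels.
dim(parts) <- c(nlevels(f1), nlevels(f2))
dimnames(parts) <- list(levels(f1), levels(f2))

parts[["b", "G"]]   # the data frame for v1 == "b" and v2 == "G"

# Route 2: keep by(), but pass identity instead of subset.
bymat <- by(df, list(df$v1, df$v2), identity)
dim(bymat)          # 5 4
```

With route 1, combinations that do not occur in the data are kept as zero-row data frames, whereas tapply() and by() leave those cells as NULL, so pick whichever suits the downstream code.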
    