The Archive Base

Editorial Team
Asked: June 4, 2026

With the recent introduction of the package dataframe, I thought it was time


With the recent introduction of the package dataframe, I thought it was time to properly benchmark the various data structures and to highlight what each is best at. I'm no expert on the different strengths of each, so my question is: how should we go about benchmarking them?

Some (rather crude) things I have tried:

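library(microbenchmark)
library(data.table)
# 100 x 100 numeric matrix used as the common source for every structure
mat <- matrix(rnorm(10000), nrow = 100)
# data.frame built before the dataframe package is attached
mat2df.base <- data.frame(mat)
# data.frame built after the dataframe package is attached
library(dataframe)
mat2df.dataframe <- data.frame(mat)
mat2dt <- data.table(mat)
# time transposing each structure, 1000 repetitions per expression
bm <- microbenchmark(t(mat), t(mat2df.base), t(mat2df.dataframe), t(mat2dt), times = 1000)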

Results:

Unit: microseconds
                 expr      min       lq   median       uq       max
1              t(mat)   20.927   23.210   31.201   36.908   951.591
2      t(mat2df.base)  929.903  974.039  997.439 1040.814 28270.717
3 t(mat2df.dataframe)  924.957  969.093  992.683 1025.404 27255.205
4           t(mat2dt) 1749.465 1817.382 1857.903 1909.649  5347.321
1 Answer

  1. Editorial Team
    Added an answer on June 4, 2026 at 7:50 am

    I’m no data.table expert, but from what I understand its primary advantage is in indexing. So try subsetting with the various packages to compare speeds.

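    library(microbenchmark)
    library(data.table)
    # 1e6 rows x 10 columns, plus a character key with ten distinct values
    mat <- matrix(rnorm(1e7), ncol = 10)
    key <- as.character(sample(1:10, 1e6, replace = TRUE))
    mat2df.base <- data.frame(mat)
    mat2df.base$key <- key

    # baseline timing, taken before the dataframe package is attached
    bm.before <- microbenchmark(
      mat2df.base[mat2df.base$key == 2, ]
    )

    library(dataframe)
    mat2df.dataframe <- data.frame(mat)
    mat2df.dataframe$key <- key
    mat2dt <- data.table(mat)
    mat2dt$key <- key
    # sort by key so that subsetting on it can use binary search
    setkey(mat2dt, key)

    # the same subset expressed for each structure
    bm.subset <- microbenchmark(
      mat2df.base[mat2df.base$key == 2, ],
      mat2df.dataframe[mat2df.dataframe$key == 2, ],
      mat2dt["2", ]
    )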
    
                                               expr       min        lq    median       uq       max
    1           mat2df.base[mat2df.base$key == 2, ] 153.99596 154.98602 155.91621 157.0894 194.24456
    2 mat2df.dataframe[mat2df.dataframe$key == 2, ] 153.63907 154.66295 155.68553 156.9827 173.76913
    3                                 mat2dt["2", ]  15.51085  15.66742  15.72899  15.8463  22.53044
    

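    With a sufficiently large matrix, data.table wipes the floor with the other options: the keyed subset has a median of about 15.7 against about 156 for either data.frame, roughly ten times faster.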
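
Before reading much into the timings, it may be worth checking that the expressions select the same rows, since the key is stored as character but compared against the number 2 in the data.frame calls. A minimal sketch on a much smaller table, assuming data.table is installed; the column name `grp` and the object names are my own, not taken from the benchmark above:

```r
library(data.table)

set.seed(42)                      # reproducible sample
n <- 1e4                          # far smaller than the benchmark, for a quick check
grp <- as.character(sample(1:10, n, replace = TRUE))

df <- data.frame(matrix(rnorm(n * 3), ncol = 3))
df$grp <- grp

dt <- data.table(df)              # copy the same data into a data.table
setkey(dt, grp)                   # sorts by grp and marks it as the key

rows_df      <- df[df$grp == 2, ]     # 2 is coerced to "2" for the comparison
rows_dt_key  <- dt[.("2")]            # lookup on the key
rows_dt_scan <- dt[grp == "2"]        # ordinary logical subset on the column

# All three should agree on how many rows match.
same_count <- nrow(rows_df) == nrow(rows_dt_key) &&
  nrow(rows_dt_key) == nrow(rows_dt_scan)
print(same_count)
```

If the counts disagree, the benchmark is comparing different work and the timings are not comparable.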

    Also, I suspect that @RJ-'s attempt to compare base data.frame with the dataframe package's data.frames is not working. The timings are just too similar, and I suspect both sets of results come from the loaded library rather than from base.

    Edit: Tested. Doesn’t seem to make much of a difference. bm.after is the same code as bm.subset above, just run at the same time as bm.before to provide an accurate comparison.

    bm.before <- microbenchmark( 
      mat2df.base[mat2df.base$key==2,] 
    )
    
    > bm.after
    Unit: milliseconds
                                               expr       min        lq    median        uq       max
    1           mat2df.base[mat2df.base$key == 2, ] 160.62708 166.25787 167.52325 169.18710 173.47864
    2 mat2df.dataframe[mat2df.dataframe$key == 2, ] 163.30259 166.00588 167.80138 169.24647 174.05713
    3                                 mat2dt["2", ]  16.16117  16.89627  17.09047  17.37057  62.01954
    
    > bm.before
    Unit: milliseconds
                                     expr     min       lq   median       uq      max
    1 mat2df.base[mat2df.base$key == 2, ] 159.178 160.9867 162.1149 164.0046 195.9501
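
A complementary check for the masking suspicion is to ask R which namespace supplies the functions being called. The sketch below uses only base and utils; the variable names are my own. I'm not certain how the dataframe package installs its replacements, so this only reports what the current session dispatches to, and is meant to be run once before and once after `library(dataframe)`:

```r
# Namespace that supplies the data.frame() visible on the search path right now.
constructor_source <- environmentName(environment(data.frame))

# Namespace that supplies the `[` method used when subsetting a data.frame.
subset_method <- utils::getS3method("[", "data.frame")
subset_source <- environmentName(environment(subset_method))

# One object built with an explicit base:: call, one with whatever is visible.
mat <- matrix(rnorm(1000), ncol = 10)
df_base    <- base::data.frame(mat)
df_visible <- data.frame(mat)

cat("data.frame() comes from:", constructor_source, "\n")
cat("`[.data.frame` comes from:", subset_source, "\n")
cat("same class:", identical(class(df_base), class(df_visible)), "\n")
```

If both objects have the same class and subsetting dispatches to the same method for each, near-identical timings are what you would expect.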
    
© 2021 The Archive Base. All Rights Reserved