The Archive Base

Editorial Team
Asked: June 13, 2026


I have the following code using data.frames, and I’m wondering how to write this using data.tables, using the most efficient, most vectorized code?

data.frame code:

set.seed(1)
to <- cbind(data.frame(time=seq(1:5),bananas=sample(100,5),apples=sample(100,5)),setNames(data.frame(matrix(sample(100,90,replace=T),nrow=5)),paste0(1:18)))
from <- cbind(data.frame(time=seq(1:5),blah=sample(100,5),foo=sample(100,5)),setNames(data.frame(matrix(sample(100,90,replace=T),nrow=5)),paste0(1:18)))
from
to

rownames(to) <- to$time
to[as.character(from$time),paste0(1:18)] <- from[,paste0(1:18)]
to

Running this:

>     set.seed(1)
>     to <- cbind(data.frame(time=seq(1:5),bananas=sample(100,5),apples=sample(100,5)),setNames(data.frame(matrix(sample(100,90,replace=T),nrow=5)),paste0(1:18)))
>     from <- cbind(data.frame(time=seq(1:5),blah=sample(100,5),foo=sample(100,5)),setNames(data.frame(matrix(sample(100,90,replace=T),nrow=5)),paste0(1:18)))
>     from
  time blah foo  1  2   3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
1    1   66  22 98  2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
2    2   35  13 74 72  50 52  8 57 61 18 56 53 90  7 85 65 20 76 39 12
3    3   27  47 36 11  49 21  4 53 24 75 33  8 45 34 86 75 89 73 11 85
4    4   97  90 44 45  18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
5    5   61  58 15 65  76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
>     to
  time bananas apples  1   2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
1    1      27     90 21  50 94 39 49 67 83 79 48 10 92 26 34 90 44 21 24 80
2    2      37     94 18  72 22  2 60 80 65  3 87 32 30 48 84 87 72 72  6 46
3    3      57     65 69 100 66 39 50 11 79 48 44 52 46 77 35 39 40 13 65 42
4    4      89     62 39  39 13 87 19 73 56 74 25 67 34  9 34 78 33 25 88 82
5    5      20      6 77  78 27 35 83 42 53 70  8 41 66 88 48 97 76 15 78 61
> 
>     rownames(to) <- to$time
>     to[as.character(from$time),paste0(1:18)] <- from[,paste0(1:18)]
>     to
  time bananas apples  1  2   3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
1    1      27     90 98  2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
2    2      37     94 74 72  50 52  8 57 61 18 56 53 90  7 85 65 20 76 39 12
3    3      57     65 36 11  49 21  4 53 24 75 33  8 45 34 86 75 89 73 11 85
4    4      89     62 44 45  18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
5    5      20      6 15 65  76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79

Basically, we update columns paste0(1:18) of to from columns paste0(1:18) of from, matching up the times.

data.tables apparently have some advantages, such as not needing head when printing them at the console, so I’m thinking about using them.

However, I’d like not to have to write the := expressions by hand, i.e. try to avoid:

to[from,`1`:=i.`1`,`2`:=i.`2`, ..]

I’d also prefer to use vectorized syntax if possible, rather than some kind of for loop, i.e. try to avoid something like:

for( i in 1:18 ) {
    to[from, sprintf("%d",i) := i.sprintf("%d",i)]
}

I read through the faq vignette, and the datatable-intro vignette, though I admit I probably haven’t understood everything 100%.

I looked at "Loop through columns in a data.table and transform those columns", but I can’t say I understand it 100%, and it seems to say that I need to use a for loop?

There does seem to be some kind of a hint at the bottom of 8374816 that it might be possible to just use data frame syntax, adding with=FALSE? But since the data.frame procedure is hacking on the row names, I’m not sure how well / if that will work, and I wonder to what extent that makes use of the efficiencies of data.table?


1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 11:17 am

    Good question. The base construct you’ve shown:

    to[as.character(from$time),paste0(1:18)] <- from[,paste0(1:18)]
    

    works on the assumption that row names are not duplicated (if they were, only the first would be matched). Here, the LHS of <- has the same number of rows as the RHS of <-.

    data.table is different since routinely, multiple rows in to may match; the default for mult is "all". data.table also prefers long format to wide. So this question is kind of putting data.table through its paces for something it wasn’t really designed for. If you have any NA in those 18 columns (i.e. sparse), then a long format may be more appropriate. If all 18 columns are the same type, then a matrix may be more appropriate.
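
    To illustrate the long-format remark, here is a minimal sketch, assuming two small hypothetical tables with two numbered columns instead of 18. The names to_wide, from_wide, slot and value are made up for this example. Once both tables are melted, the update is a single := on one value column, joined on time and slot.

    ```r
    library(data.table)

    # Hypothetical wide tables: two numbered columns stand in for the 18.
    to_wide   <- data.table(time = 1:2, bananas = c(27L, 37L),
                            `1` = c(21L, 18L), `2` = c(50L, 72L))
    from_wide <- data.table(time = 1:2,
                            `1` = c(98L, 74L), `2` = c(2L, 72L))

    # Reshape to long format: one row per (time, slot) pair.
    to_long   <- melt(to_wide, id.vars = c("time", "bananas"),
                      variable.name = "slot", value.name = "value",
                      variable.factor = FALSE)
    from_long <- melt(from_wide, id.vars = "time",
                      variable.name = "slot", value.name = "value",
                      variable.factor = FALSE)

    # One update join replaces the per-column assignments.
    to_long[from_long, on = c("time", "slot"), value := i.value]
    ```

    The trade-off is that the other columns (bananas here) are repeated once per slot, so this only pays off if the rest of the workflow also suits long data.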

    That said, here are three data.table options for completeness.

    1. Using := but without a for loop (multiple LHS and multiple RHS in LHS:=RHS)

    from = as.data.table(from)
    to = as.data.table(to)
    from
       time blah foo  1  2   3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
    1:    1   66  22 98  2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
    2:    2   35  13 74 72  50 52  8 57 61 18 56 53 90  7 85 65 20 76 39 12
    3:    3   27  47 36 11  49 21  4 53 24 75 33  8 45 34 86 75 89 73 11 85
    4:    4   97  90 44 45  18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
    5:    5   61  58 15 65  76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
    to
       time bananas apples  1   2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
    1:    1      27     90 21  50 94 39 49 67 83 79 48 10 92 26 34 90 44 21 24 80
    2:    2      37     94 18  72 22  2 60 80 65  3 87 32 30 48 84 87 72 72  6 46
    3:    3      57     65 69 100 66 39 50 11 79 48 44 52 46 77 35 39 40 13 65 42
    4:    4      89     62 39  39 13 87 19 73 56 74 25 67 34  9 34 78 33 25 88 82
    5:    5      20      6 77  78 27 35 83 42 53 70  8 41 66 88 48 97 76 15 78 61
    setkey(to,time)
    setkey(from,time)
    to[from,paste0(1:18):=from[.GRP,paste0(1:18),with=FALSE]]
       time bananas apples  1  2   3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
    1:    1      27     90 98  2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
    2:    2      37     94 74 72  50 52  8 57 61 18 56 53 90  7 85 65 20 76 39 12
    3:    3      57     65 36 11  49 21  4 53 24 75 33  8 45 34 86 75 89 73 11 85
    4:    4      89     62 44 45  18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
    5:    5      20      6 15 65  76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
    

    or

    to[from,paste0(1:18):=from[,paste0(1:18),with=FALSE],mult="first"]
       time bananas apples  1  2   3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
    1:    1      27     90 98  2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
    2:    2      37     94 74 72  50 52  8 57 61 18 56 53 90  7 85 65 20 76 39 12
    3:    3      57     65 36 11  49 21  4 53 24 75 33  8 45 34 86 75 89 73 11 85
    4:    4      89     62 44 45  18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
    5:    5      20      6 15 65  76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
    

    Note that I’m using the latest v1.8.3, which is needed for the first form of option 1 to work (.GRP has just been added, and the outer with=FALSE is no longer needed).
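
    On more recent data.table releases, a commonly used way to write option 1 is an update join with on= and mget(). The following is a minimal sketch under that assumption, not the code that produced the output above. It uses small hypothetical tables with three numbered columns instead of 18, and from deliberately covers only some of the times, in a different order.

    ```r
    library(data.table)

    cols <- paste0(1:3)   # stands in for paste0(1:18)

    # Hypothetical data: `to` has five times, `from` only three, shuffled.
    to   <- data.table(time = 1:5, bananas = 11:15,
                       `1` = 1:5, `2` = 6:10, `3` = 11:15)
    from <- data.table(time = c(3L, 1L, 5L), blah = 7:9,
                       `1` = c(30L, 10L, 50L),
                       `2` = c(31L, 11L, 51L),
                       `3` = c(32L, 12L, 52L))

    # For each row of `from`, find the rows of `to` with the same time and
    # overwrite the listed columns by reference. The i. prefix refers to the
    # columns of `from`; mget() fetches all of them by name.
    to[from, on = "time", (cols) := mget(paste0("i.", cols))]
    ```

    Rows of to whose time does not appear in from (2 and 4 here) are left untouched, and no setkey() call is needed because the join column is given through on=.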

    2. Use one list column to store the length 18 vectors, rather than 18 columns

    to = data.table( time=seq(1:5),
                     bananas=sample(100,5),
                     apples=sample(100,5),  
                     v18=replicate(5,sample(100,18),simplify=FALSE))
    from =  data.table( time=seq(1:5),
                        blah=sample(100,5),
                        foo=sample(100,5),
                        v18=replicate(5,sample(100,18),simplify=FALSE))
    setkey(to,time)
    setkey(from,time)
    
    from
       time blah foo                 v18
    1:    1   56  97   88,47,1,71,69,18,
    2:    2   69  40   96,99,60,3,33,27,
    3:    3   65  84 100,38,56,72,84,55,
    4:    4   98  74 91,69,24,63,27,100,
    5:    5   46  52    65,4,59,41,8,51,
    
    to
       time bananas apples                 v18
    1:    1      66     73 100,36,74,77,68,46,
    2:    2      19     37   84,88,92,8,37,52,
    3:    3      94     77   37,94,13,7,93,43,
    4:    4      88      2  27,93,71,16,46,66,
    5:    5      91     91   85,94,58,49,19,1,
    
    to[from,v18:=i.v18]
    to
       time bananas apples                 v18
    1:    1      66     73   88,47,1,71,69,18,
    2:    2      19     37   96,99,60,3,33,27,
    3:    3      94     77 100,38,56,72,84,55,
    4:    4      88      2 91,69,24,63,27,100,
    5:    5      91     91    65,4,59,41,8,51,
    

    If you are not used to list column printing, the trailing comma signifies that more items are in that vector. Just the first 6 are printed.
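
    For readers new to list columns, a short sketch of getting values back out may help. It assumes a tiny hypothetical table; the column names n_items and first_item are made up for this example.

    ```r
    library(data.table)

    # Hypothetical table with one list column holding a length-18 vector per row.
    dt <- data.table(time = 1:2, v18 = list(1:18, 18:1))

    # The full vector for the first row, even though printing truncates it.
    first_row <- dt$v18[[1]]

    # Per-row summaries of the list column, added by reference.
    dt[, n_items := lengths(v18)]
    dt[, first_item := vapply(v18, function(v) v[1L], integer(1))]
    ```

    Each cell of v18 is an ordinary R vector, so anything that works on a list (lapply, vapply, lengths) works on the column.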

    3. Use data.frame syntax on the data.table

    to = as.data.table(to)
    from = as.data.table(from)
    setkey(to,time)
    setkey(from,time)
    
    from
       time blah foo  1  2   3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
    1:    1   66  22 98  2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
    2:    2   35  13 74 72  50 52  8 57 61 18 56 53 90  7 85 65 20 76 39 12
    3:    3   27  47 36 11  49 21  4 53 24 75 33  8 45 34 86 75 89 73 11 85
    4:    4   97  90 44 45  18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
    5:    5   61  58 15 65  76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
    
    to
       time bananas apples  1   2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
    1:    1      27     90 21  50 94 39 49 67 83 79 48 10 92 26 34 90 44 21 24 80
    2:    2      37     94 18  72 22  2 60 80 65  3 87 32 30 48 84 87 72 72  6 46
    3:    3      57     65 69 100 66 39 50 11 79 48 44 52 46 77 35 39 40 13 65 42
    4:    4      89     62 39  39 13 87 19 73 56 74 25 67 34  9 34 78 33 25 88 82
    5:    5      20      6 77  78 27 35 83 42 53 70  8 41 66 88 48 97 76 15 78 61
    
    to[from, paste0(1:18)] <- from[,paste0(1:18),with=FALSE]
    to
       time bananas apples  1  2   3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
    1:    1      27     90 98  2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
    2:    2      37     94 74 72  50 52  8 57 61 18 56 53 90  7 85 65 20 76 39 12
    3:    3      57     65 36 11  49 21  4 53 24 75 33  8 45 34 86 75 89 73 11 85
    4:    4      89     62 44 45  18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
    5:    5      20      6 15 65  76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
    

    So the LHS of <- can use data.table keyed join syntax, i.e. to[from]. It’s just that this method (currently in R) will copy the entire to dataset. That’s what := was introduced to avoid, by providing update by reference. Also, if each row in from matches multiple rows in to, then the RHS of <- would need to be expanded to line up (by you, the user); otherwise the RHS would be recycled to fill up the LHS. That’s one reason why, in data.table, we like := being inside j, all inside [...].
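
    Both points can be seen in a small sketch, assuming a hypothetical to table in which time 1 appears twice. The names alias and snapshot are made up for this example, and the on= form of the join is used instead of keys.

    ```r
    library(data.table)

    # Hypothetical data: time 1 occurs twice in `to`.
    to   <- data.table(time = c(1L, 1L, 2L), v = c(0L, 0L, 0L))
    from <- data.table(time = c(1L, 2L), v = c(10L, 20L))

    alias    <- to         # a second name for the same table, not a copy
    snapshot <- copy(to)   # an explicit, independent copy

    # Update by reference: every row of `to` that matches a row of `from`
    # receives that row's value, so both time == 1 rows get 10.
    to[from, on = "time", v := i.v]
    ```

    After the update, alias shows the new values because := changed the table in place, while snapshot still holds the zeros. Inside the join, the expansion to multiple matching rows is handled by data.table rather than by the user.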
