Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8498071
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 11, 20262026-06-11T00:18:11+00:00 2026-06-11T00:18:11+00:00

Update: My apologies… I should have specified that there are missing values for some

  • 0

Update: My apologies… I should have specified that there are missing values for some tests (come conducted n times, others conducted m times), and I’m thinking that’s going to affect the answer based on the merge documentation:

all.x logical; if TRUE, then extra rows will be added to the output, one for each row in x that has no matching row in y. These rows will have NAs in those columns that are usually filled with values from y. The default is FALSE, so that only rows with data from both x and y are included in the output.

I don’t want either of these options: namely, I don’t want only matching row/column values kept, nor do I want non-matching row/column intersection values filled in with NAs.


I have data from an experiment and stored my chemical compositions in one file and testing in another. I’d like to merge these so I can plot boxplots faceted by composition factor level, but am having a hard time figuring out how to do it. Here’s some example data (adjusted per the above):

set.seed(1)
df.ids <- data.frame(lot=c("A","B","C"),
                     var1=c(1,2,3),
                     var2=c(4,5,6))

df.testing <- data.frame(lot=c(rep("A",4), rep("B",4),rep("C",4)),
                         test1=rep(c(runif(3,min=1,max=5),NA),3),
                         test2=rep(c(runif(2,min=1,max=5),NA,NA),3),
                         test3=rep(runif(4,min=1,max=5),3))

Like my real data, the ids data frame has one row per factor level, but the test data has multiple for each test replicate. How might I merge these so that I fill in the compositions for each of the lots? In other words, my resultant data frame would look like this:

| lot |    test1 | test2    |    test3 | var1 | var2 |
|-----+----------+----------+----------+------+------|
| A   | 2.062035 | 4.632831 | 4.593559 | 1    | 4    |
| A   | 2.488496 | 1.806728 | 4.778701 | 1    | 4    |
| A   | 3.291413 | NA       | 3.643191 | 1    | 4    |
| A   |       NA | NA       | 3.516456 | 1    | 4    |
| B   | 2.062035 | 4.632831 | 4.593559 | 2    | 5    |
| B   | 2.488496 | 1.806728 | 4.778701 | 2    | 5    |
| B   | 3.291413 | NA       | 3.643191 | 2    | 5    |
| B   |       NA | NA       | 3.516456 | 2    | 5    |
| C   | 2.062035 | 4.632831 | 4.593559 | 3    | 6    |
| C   | 2.488496 | 1.806728 | 4.778701 | 3    | 6    |
| C   | 3.291413 | NA       | 3.643191 | 3    | 6    |
| C   |       NA | NA       | 3.516456 | 3    | 6    |

In the end, I suppose all I’m trying to do is match unique lot IDs and then just replicate the values from df.ids for each of it’s columns into df.testing. I just don’t know how (well, I thought of doing it with a for() loop, but I’m sure there’s a better way)!

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-11T00:18:12+00:00Added an answer on June 11, 2026 at 12:18 am

    You can just rearrange the order of the columns:

    > merge(df.ids, df.testing)
      lot var1 var2    test1    test2
    1   A    1    4 3.389578 11.55608
    2   A    1    4 9.502077 15.47929
    3   A    1    4 9.173870 17.18321
    4   B    2    5 4.349115 12.85377
    5   B    2    5 2.815137 14.45693
    6   B    2    5 6.947180 17.45857
    7   C    3    6 9.085507 17.92857
    8   C    3    6 6.155680 12.58901
    9   C    3    6 6.662026 19.92715
    
    > merge(df.ids, df.testing)[c(1,4:5,2:3)]
      lot    test1    test2 var1 var2
    1   A 3.389578 11.55608    1    4
    2   A 9.502077 15.47929    1    4
    3   A 9.173870 17.18321    1    4
    4   B 4.349115 12.85377    2    5
    5   B 2.815137 14.45693    2    5
    6   B 6.947180 17.45857    2    5
    7   C 9.085507 17.92857    3    6
    8   C 6.155680 12.58901    3    6  
    9   C 6.662026 19.92715    3    6
    

    This turns out to be an inner-join because there are no missing items in the merge-by columns, but the by.x and by.y parameters can be chosen so you get a left-outer or a right-outer join if needed. If for some reason you wanted the order of the items in df.testing to be maintained, you would need to create a column that got carried along, which would then be order()ed-by.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

UPDATE 6/21/12 I have a form in rails that is working similar to an
I have some code that I have inherited that generates a matrix of significance
I have an update trigger, part of which is below that is causing the
Update: Is there a way to achieve what I'm trying to do in an
UPDATE UPI_ATTRIBUTE SET SITE_INC ='0' WHERE USER_PROFILING_NAME IN ('CAR_IMPLICIT','CAR_EXPLICIT') Above is my query that
Frstly my apologies if this is a duplicate question. I have tried to find
Unsure if this should be on here or serverfault so apologies if I'm wrong.
My limited SQL knowledge prompted this post! I have a stored procedure that runs
My apologies for a silly question; just cannot come up with a proper term
I have a button on my windows form that calls the RunWorkerAsync() method, this

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.