The Archive Base

Editorial Team
Asked: June 5, 2026

I have a data.frame with 1,000 rows and 3 columns. It contains a large number of duplicates and I’ve used plyr to combine the duplicate rows and add a count for each combination as explained in this thread.

Here’s an example of what I have now (I still also have the original data.frame with all of the duplicates if I need to start from there):

   name1    name2    name3     total
1  Bob      Fred     Sam       30
2  Bob      Joe      Frank     20
3  Frank    Sam      Tom       25
4  Sam      Tom      Frank     10
5  Fred     Bob      Sam       15

However, column order doesn’t matter. I just want to know how many rows have the same three entries, in any order. How can I combine the rows that contain the same entries, ignoring order? In this example I would want to combine rows 1 and 5, and rows 3 and 4.

1 Answer

  1. Editorial Team
    Added an answer on June 5, 2026 at 12:22 pm

    Define another column that’s a “sorted paste” of the names, which would have the same value of “Bob~Fred~Sam” for rows 1 and 5. Then aggregate based on that.
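
To see why sorting before pasting makes the key independent of column order, here is a minimal sketch on rows 1 and 5 of the example. The helper name make_key is hypothetical and used only for illustration; it is not part of the original answer.

```r
# Hypothetical helper: build an order-independent key from a vector of names.
make_key <- function(x) paste(sort(x), collapse = "~")

row1 <- c("Bob", "Fred", "Sam")   # names from row 1 of the example
row5 <- c("Fred", "Bob", "Sam")   # names from row 5 of the example

make_key(row1)                    # "Bob~Fred~Sam"
make_key(row5)                    # "Bob~Fred~Sam"

# Both rows map to the same key, so they fall into the same group.
identical(make_key(row1), make_key(row5))
```

The separator "~" is assumed not to occur inside any name; if it could, a different separator would be needed to keep keys unambiguous.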

    Brief code snippet (assumes the original data frame is dd). We create a lookup column holding the sorted, pasted names, sum the total column for each distinct lookup value, and then keep the first row for each unique combination:

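    # Order-independent key: sort each row's names, then paste them together
    dd$lookup <- apply(dd[, c("name1", "name2", "name3")], 1,
                       function(x) paste(sort(x), collapse = "~"))
    # Sum of total for each key
    tab1 <- tapply(dd$total, dd$lookup, sum)
    # Keep the first row seen for each key
    ee <- dd[match(unique(dd$lookup), dd$lookup), ]
    # Attach the summed totals to those rows
    ee$newtotal <- as.numeric(tab1)[match(ee$lookup, names(tab1))]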
    

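    You now have in ee one row per unique combination, with its summed total in newtotal. The name columns of each row keep the order in which that combination first appeared. No external packages are needed, and you can inspect the intermediate objects (dd$lookup, tab1, ee) at every stage to see what is going on.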
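
As a check, the same steps can be run on the five example rows from the question. This sketch rebuilds that table by hand as dd (typing it in is an assumption about how the data is stored) and should leave three rows in ee.

```r
# Example data copied from the question.
dd <- data.frame(
  name1 = c("Bob", "Bob", "Frank", "Sam", "Fred"),
  name2 = c("Fred", "Joe", "Sam", "Tom", "Bob"),
  name3 = c("Sam", "Frank", "Tom", "Frank", "Sam"),
  total = c(30, 20, 25, 10, 15),
  stringsAsFactors = FALSE
)

# Order-independent key for each row.
dd$lookup <- apply(dd[, c("name1", "name2", "name3")], 1,
                   function(x) paste(sort(x), collapse = "~"))

# Sum the totals per key, keep the first row per key, attach the sums.
tab1 <- tapply(dd$total, dd$lookup, sum)
ee <- dd[match(unique(dd$lookup), dd$lookup), ]
ee$newtotal <- as.numeric(tab1)[match(ee$lookup, names(tab1))]

# Rows 1 and 5 combine to 45, row 2 stays at 20, rows 3 and 4 combine to 35.
ee[, c("lookup", "newtotal")]
```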

    (Minor update to help OP:) And if you want a cleaned-up version of the final answer:

    outdf = with(ee,data.frame(name1,name2,name3,
                               total=newtotal,stringsAsFactors=FALSE))
    

    This gives you a neat data frame with the three all-important name columns, and with the aggregated totals in a column called total rather than newtotal.
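
If the intermediate objects are not needed, the grouping can also be sketched in one step with base R's aggregate(). This is an alternative to the tapply/match approach above, not part of the original answer, and it returns the key rather than the three separate name columns. The data frame is again typed in from the question's example.

```r
# Example data copied from the question.
dd <- data.frame(
  name1 = c("Bob", "Bob", "Frank", "Sam", "Fred"),
  name2 = c("Fred", "Joe", "Sam", "Tom", "Bob"),
  name3 = c("Sam", "Frank", "Tom", "Frank", "Sam"),
  total = c(30, 20, 25, 10, 15),
  stringsAsFactors = FALSE
)

# Same order-independent key as before.
dd$lookup <- apply(dd[, c("name1", "name2", "name3")], 1,
                   function(x) paste(sort(x), collapse = "~"))

# One row per key, with the totals summed within each key.
agg <- aggregate(total ~ lookup, data = dd, FUN = sum)
agg
```

The trade-off is that agg holds only the key and the summed total; the approach with ee keeps a representative row with the original name columns.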

© 2021 The Archive Base. All Rights Reserved