I’m working in R with the following dataset for a metabolomics study. first Name


Asked: June 15, 2026 (2026-06-15T10:58:51+00:00)


I’m working in R with the following dataset for a metabolomics study.

first  Name         Area    Sample  Similarity
120    Pentanone    699468  PO4:1   954
120    Pentanone    153744  PO2:1   981
126    Methylamine  83528   PO4:1   887
126    Unknown      32741   PO2:1   645
126    Sulfurous    43634   PO1:1   800

Among the rows that share the same value in the first column (for example 120), I want to select the compounds with the same name (for example Pentanone). From that selection, I want to copy the information from the row with the highest similarity and create new columns in the table from it. In this case, that is the following row:

120 Pentanone   153744  PO2:1   981

I know that “send me the code” posts are not very appreciated, but I would greatly appreciate some clues on how to start.


1 Answer

Editorial Team · Answer 1 · 2026-06-15T10:58:52+00:00

There are many options. You already have one example using plyr; here are two more.

Base R approach, using aggregate and merge:

merge(dat, aggregate(Similarity ~ first + Name, dat, max))
#   first        Name Similarity   Area Sample
# 1   120   Pentanone        981 153744  PO2:1
# 2   126 Methylamine        887  83528  PO4:1
# 3   126   Sulfurous        800  43634  PO1:1
# 4   126     Unknown        645  32741  PO2:1
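
To try this out, `dat` has to exist first. Here is a minimal sketch that rebuilds the five rows from the question and splits the one-liner into two steps. The column types are an assumption, since the question does not show them, and `best` and `result` are just illustrative names.

```r
# Rebuild the question's table; column types are assumed, not given in the question
dat <- data.frame(
  first = c(120, 120, 126, 126, 126),
  Name = c("Pentanone", "Pentanone", "Methylamine", "Unknown", "Sulfurous"),
  Area = c(699468, 153744, 83528, 32741, 43634),
  Sample = c("PO4:1", "PO2:1", "PO4:1", "PO2:1", "PO1:1"),
  Similarity = c(954, 981, 887, 645, 800),
  stringsAsFactors = FALSE
)

# Step 1: highest Similarity within each (first, Name) group
best <- aggregate(Similarity ~ first + Name, data = dat, FUN = max)

# Step 2: merge() joins on every shared column (first, Name, Similarity),
# so only the rows that carry their group's maximum survive
result <- merge(dat, best)
```

Note that if two rows in a group tie for the highest similarity, both are kept, because both match the aggregated maximum.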

A sqldf approach:

library(sqldf)
sqldf("select *, max(Similarity) `Similarity` from dat group by first, Name")
#   first        Name Similarity   Area Sample
# 1   120   Pentanone        981 153744  PO2:1
# 2   126 Methylamine        887  83528  PO4:1
# 3   126   Sulfurous        800  43634  PO1:1
# 4   126     Unknown        645  32741  PO2:1
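
If "new columns within the table" means keeping every original row and attaching the best match beside it, one possible sketch is to merge the winning rows back onto the full table. This reading of the question is an assumption, and the `.best` suffix and the names `best` and `wide` are made up for illustration.

```r
# Same five rows as in the question; column types are assumed
dat <- data.frame(
  first = c(120, 120, 126, 126, 126),
  Name = c("Pentanone", "Pentanone", "Methylamine", "Unknown", "Sulfurous"),
  Area = c(699468, 153744, 83528, 32741, 43634),
  Sample = c("PO4:1", "PO2:1", "PO4:1", "PO2:1", "PO1:1"),
  Similarity = c(954, 981, 887, 645, 800),
  stringsAsFactors = FALSE
)

# Row with the highest Similarity per (first, Name) group
best <- merge(dat, aggregate(Similarity ~ first + Name, data = dat, FUN = max))

# Join the winners back onto every original row; the original columns keep
# their names and the winner's columns get the ".best" suffix
wide <- merge(dat, best, by = c("first", "Name"), suffixes = c("", ".best"))
```

Each row of `wide` then carries its own `Area`, `Sample` and `Similarity` plus `Similarity.best`, `Area.best` and `Sample.best` from the top-scoring row of its group.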
