I have a data frame that looks like this: _______id text 1 | 7821

Question


Asked: June 18, 2026

I have a data frame that looks like this:

        | id               | text
    1   | 7821             | "some text here"
    2   | 7821             | "here as well"
    3   | 7821             | "and here"
    4   | 567              | "etcetera"
    5   | 567              | "more text"
    6   | 231              | "other text"

I would like to combine the texts for each ID into a single row, so that I can run a clustering algorithm:

        | id               | text
    1   | 7821             | "some text here here as well and here"
    2   | 567              | "etcetera more text"
    3   | 231              | "other text"

Is there any way to do this? I am importing from a database table and I have a lot of data, so I can’t do it manually.

1 Answer

Editorial Team · Answer 1 · 2026-06-18T08:36:54+00:00

You’re actually looking for aggregate, not merge, and there should be many examples on Stack Overflow demonstrating different options for aggregation. Here’s the most basic and direct approach, using the formula interface to specify which column to aggregate (left of the ~) and which column to group by (right of the ~).

Here’s your data in a copy-and-pasteable form:

mydata <- structure(list(id = c(7821L, 7821L, 7821L, 567L, 567L, 231L), 
    text = structure(c(6L, 3L, 1L, 2L, 4L, 5L), .Label = c("and here", 
    "etcetera", "here as well", "more text", "other text", "some text here"
    ), class = "factor")), .Names = c("id", "text"), class = "data.frame", 
    row.names = c(NA, -6L))

Here’s the aggregated output.

aggregate(text ~ id, mydata, paste, collapse = " ")
#     id                                 text
# 1  231                           other text
# 2  567                   etcetera more text
# 3 7821 some text here here as well and here
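
If the formula form feels opaque, the same aggregation can be sketched with the non-formula interface of aggregate. This is only an illustration: it rebuilds the example data with plain character text (an assumption, on the guess that data imported from a database arrives as character rather than factor), and the name res is made up for the example.

```r
# Example data with character text instead of a factor (assumption:
# database imports usually give character columns).
mydata <- data.frame(
  id = c(7821L, 7821L, 7821L, 567L, 567L, 231L),
  text = c("some text here", "here as well", "and here",
           "etcetera", "more text", "other text"),
  stringsAsFactors = FALSE
)

# Non-formula interface: the grouping variable goes in a named list, and
# extra arguments such as collapse are passed through to paste().
res <- aggregate(mydata$text, by = list(id = mydata$id),
                 FUN = paste, collapse = " ")

# The aggregated column is called "x" by default; rename it to "text".
names(res)[names(res) == "x"] <- "text"
res
```

As with the formula version, the rows come back sorted by id rather than in the order the ids first appear.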

Of course, there is also data.table, which has nice compact syntax (and awesome speed):

library(data.table)
DT <- data.table(mydata)
DT[, paste(text, collapse = " "), by = "id"]
#      id                                   V1
# 1: 7821 some text here here as well and here
# 2:  567                   etcetera more text
# 3:  231                           other text
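
The default column name V1 is not very descriptive. Here is a small sketch of one way to name the result column directly, assuming the data.table package is installed; the data is rebuilt inline with character text so the snippet runs on its own, and res is just an illustrative name.

```r
library(data.table)

# Same example data, built directly as a data.table with character text.
DT <- data.table(
  id = c(7821L, 7821L, 7821L, 567L, 567L, 231L),
  text = c("some text here", "here as well", "and here",
           "etcetera", "more text", "other text")
)

# .() is data.table's alias for list(); naming the element inside it
# names the resulting column, so the output has "text" instead of "V1".
res <- DT[, .(text = paste(text, collapse = " ")), by = id]
res
```

Groups are returned in the order the ids first appear in the data, which matches the layout asked for in the question.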
