I have a large dataframe with classification information. Here is an example: > d

Question

Asked: May 26, 2026 (2026-05-26T04:52:45+00:00)

I have a large dataframe with classification information. Here is an example:

> d <- data.frame(x = c(1,2,3,4), classification = c("cl1.scl1", "cl2", "cl3-bla", "cl4.subclass2"))
> d
  x classification
1 1       cl1.scl1
2 2            cl2
3 3        cl3-bla
4 4  cl4.subclass2

Before I do any further processing I need to aggregate the classification information, which means that I have to split the classification strings by “.” and take the first token. This is the result I need:

> d
  x classification
1 1            cl1
2 2            cl2
3 3        cl3-bla
4 4            cl4

At the moment I am computing this as follows:

d$classification = unlist(lapply(d$classification, function (x) strsplit(as.character(x), ".", fixed=TRUE)[[1]][1]))

This works, but it took me quite a while to figure this out. I assume there is a more elegant solution, which I probably missed. Any suggestions? Thanks!

1 Answer

Editorial Team · Answer 1 · 2026-05-26T04:52:45+00:00

You can use regular expressions with back-references.

gsub("(.*)\\.(.*)","\\1",d$classification)

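The pattern contains two capture groups (the parts of the regular expression in parentheses), separated by a literal period. Whatever matches the whole pattern is replaced with the contents of the first group, via the back-reference \\1. Strings without a period do not match and are returned unchanged. Note that the first (.*) is greedy, so the split happens at the last period; this yields the first token only when a string contains at most one period, as in your example.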
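
As a rough sketch of how this behaves on the data from the question, the snippet below applies the back-reference pattern and, for comparison, an alternative that deletes the first period and everything after it. The `sub("\\..*$", ...)` variant and the two-period value `multi` are not from the original question; they are assumptions added only to show where the two patterns differ.

```r
# Example data from the question
d <- data.frame(
  x = c(1, 2, 3, 4),
  classification = c("cl1.scl1", "cl2", "cl3-bla", "cl4.subclass2"),
  stringsAsFactors = FALSE
)

# Pattern from the answer: keep capture group 1,
# i.e. everything before the last period
first_token_backref <- gsub("(.*)\\.(.*)", "\\1", d$classification)

# Alternative (assumption, not from the answer): remove the first
# period and everything that follows it
first_token_sub <- sub("\\..*$", "", d$classification)

# Hypothetical value with two periods, to show the difference
multi <- "cl5.sub.subsub"
gsub("(.*)\\.(.*)", "\\1", multi)  # "cl5.sub"
sub("\\..*$", "", multi)           # "cl5"

# Write the result back to the data frame
d$classification <- first_token_sub
```

Both calls are vectorised over the column, so no lapply/unlist is needed. If your classifications can contain more than one period and you always want the first token, the `sub` variant is probably the safer choice.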
