I am trying to use colsplit to break up a vector in a dataframe.

Asked: May 27, 2026

I am trying to use colsplit to break up a vector in a dataframe. The fact that we have regular expression as an arg to colsplit makes me think it can be flexible, but I am having trouble (it might just be that I’m not understanding regex in R).

Here’s the problem:

Let’s create a vector:

> library(reshape)
> my_var_1 <- factor(c("x00_aaa_123","x00_bbb_123","x00_ccc_123","x01_aaa_123","x01_bbb_123","x01_ccc_123","x02_aaa_123","x02_bbb_123","x02_ccc_123"))

I would like to split it into two columns upon the first underscore.
In other words, I want my end result to be this…

    x whatever
1 x00  aaa_123
2 x00  bbb_123
3 x00  ccc_123
4 x01  aaa_123
5 x01  bbb_123
6 x01  ccc_123
7 x02  aaa_123
8 x02  bbb_123
9 x02  ccc_123

I am trying to find the right regex to pass to colsplit that will do it, but no luck. Here’s the closest I can get:

> colsplit(my_var_1, split="_", c("x","whatever")) 
    x whatever NA.
1 x00      aaa 123
2 x00      bbb 123
3 x00      ccc 123
4 x01      aaa 123
5 x01      bbb 123
6 x01      ccc 123
7 x02      aaa 123
8 x02      bbb 123
9 x02      ccc 123

That treats the split regex as a simple delimiter and gives me three columns. I do not want to split on the second underscore (to make it worse, my real data has an arbitrary number of underscores, not just two).

Is there an expression I can use for “split” that will give what I want?

I had hoped that the regex in colsplit would allow me to match on groups and the group matches would be the content of splits but that does not appear to be the case.

Edit (thanks to @Joshuaulrich): colsplit works “as intended” when using the newer reshape2 package.

1 Answer

Editorial Team · Answer 1 · 2026-05-27T09:17:55+00:00

Your code throws an error for me:

> colsplit(my_var_1, split="_", c("x","whatever"))
Error in colsplit(my_var_1, split = "_", c("x", "whatever")) : 
  unused argument(s) (split = "_")

split isn’t an argument to colsplit in reshape2 (it was in the older reshape package, which your code loads). The argument you want is pattern, or you can just rely on positional matching:

> colsplit(my_var_1, "_", c("x","whatever"))
    x whatever
1 x00  aaa_123
2 x00  bbb_123
3 x00  ccc_123
4 x01  aaa_123
5 x01  bbb_123
6 x01  ccc_123
7 x02  aaa_123
8 x02  bbb_123
9 x02  ccc_123
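
If upgrading to reshape2 is not an option, the same split on the first underscore can be sketched in base R with sub(). This is only an illustration, not part of colsplit: the names s and result are made up for the example, and it assumes every value contains at least one underscore (a value without one would end up unchanged in both columns).

```r
# Base R sketch: split each string at the FIRST underscore only.
my_var_1 <- factor(c("x00_aaa_123", "x00_bbb_123", "x00_ccc_123",
                     "x01_aaa_123", "x01_bbb_123", "x01_ccc_123",
                     "x02_aaa_123", "x02_bbb_123", "x02_ccc_123"))

# Work on character data, since the input is a factor.
s <- as.character(my_var_1)

result <- data.frame(
  # Drop everything from the first underscore to the end of the string.
  x = sub("_.*$", "", s),
  # Drop everything up to and including the first underscore.
  whatever = sub("^[^_]*_", "", s),
  stringsAsFactors = FALSE
)

print(result)
```

Because each pattern is anchored to the first underscore, any further underscores stay in the second column, however many there are.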
