Is there a fast and clever way that would , lets say from DF

Asked: May 23, 2026 (2026-05-23T02:06:36+00:00)

Is there a fast and clever way that would, let's say, from a data frame like this

vec <- data.frame(Names = c("var1","var2","var3","var4","var5","var6","var7",
                            "var8","var9","var10","var11","var12","var13",
                            "var14") ,
                  phase1= runif(14),
                  phase1.away= runif(14),
                  phase1_in= runif(14),
                  phase1_out= runif(14),
                  phase1.1= runif(14),
                  phase1.away.1= runif(14),
                  phase1_in.1= runif(14),
                  phase1_out.1= runif(14),
                  phase1.2= runif(14),
                  phase1.away.2= runif(14),
                  phase1_in.2= runif(14),
                  phase1_out.2= runif(14))

give a new data frame like this:

- always ordered by phase1.x, giving the names of the variables corresponding to the values, plus the phase1_in and phase1_out values, but not phase1.away.

What I am doing is simply

vec.o <- vec[with(vec, order(-phase1)), ]
d1 <- vec.o[c("Names", "phase1", "phase1_in", "phase1_out")]

vec.o <- vec[with(vec, order(-phase1.1)), ]
d2 <- vec.o[c("Names", "phase1.1", "phase1_in.1", "phase1_out.1")]

cbind(d1, d2)

which is extremely tedious and, I am sure, not the R way of doing things. Any clever ideas? I work with large data frames all the time, and R seems a bit cumbersome for this. Is there any good literature you would recommend for these purposes (loading many variables, creating names for them, performing operations on them, etc.)?

Editorial Team · Answer 1 · 2026-05-23T02:06:37+00:00

EDIT: corrected for the case where phase1.x goes to 10 and higher.

I presume you have quite a lot more than phase1.1 and phase1.2, so a general solution using regular expressions would be something along the lines of:

# Build an id vector from the trailing ".x" suffix of each column name.
# as.numeric() gives a warning here, because names without a numeric
# suffix are coerced to NA.
id <- as.numeric(gsub(".*\\.(\\d+$)", "\\1", names(vec)))
id[1] <- -1          # the Names column
id[is.na(id)] <- 0   # first occurrence, no .x suffix


veclist <- lapply(unique(id)[-1], function(i) {
    # select the columns of group i plus Names, excluding the "away" columns
    out <- vec[id %in% c(i, -1) & !grepl("away", names(vec))]
    # find the phase1.x column used for ordering
    ovec <- grepl("phase1(\\.\\d+)?$", names(out))
    # order descending, as order(-phase1) does in the question
    out[order(out[, ovec], decreasing = TRUE), ]
})

do.call(cbind, veclist)

It works by recognizing the last number preceded by a dot and extracting it. If a name has no trailing number preceded by a dot, it is either the Names variable (which I mark with -1) or the first phase (which I mark with 0).
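To make the id step concrete, here is a small sketch that runs the same regular expression on the column names from the question alone, without the data. The variable names nms and suffix are only for illustration, and suppressWarnings() is my addition to silence the coercion warning mentioned above.

```r
# Column names copied from the question's example data frame
nms <- c("Names",
         "phase1", "phase1.away", "phase1_in", "phase1_out",
         "phase1.1", "phase1.away.1", "phase1_in.1", "phase1_out.1",
         "phase1.2", "phase1.away.2", "phase1_in.2", "phase1_out.2")

# Keep only a trailing ".<digits>" suffix; names without one are unchanged
suffix <- gsub(".*\\.(\\d+$)", "\\1", nms)

# Anything that is not a number becomes NA
id <- suppressWarnings(as.numeric(suffix))
id[1] <- -1          # the Names column
id[is.na(id)] <- 0   # first phase, no ".x" suffix

# Show which id each column name received
print(setNames(id, nms))

# Suffixes of 10 and higher are picked up whole, not digit by digit
gsub(".*\\.(\\d+$)", "\\1", "phase1_in.10")
```

Printing the named vector is a quick way to check that every column landed in the group you expect before running the full loop.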

Now you have an id vector that makes it easy to select the variables that belong together, so you can loop over the unique values of id, except the first (which is -1). Using regular expressions again, you can pick whatever variables you want for the construction of a new data frame. The do.call at the end combines all those data frames again.
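For illustration, a single pass of that loop can be run by hand. The sketch below assumes a cut-down version of the question's data frame (5 rows, two phases) and an arbitrary seed; the intermediate name keep is mine and not part of the solution above.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# A cut-down version of the question's data frame (5 rows, 2 phases)
vec <- data.frame(Names = paste0("var", 1:5),
                  phase1 = runif(5), phase1.away = runif(5),
                  phase1_in = runif(5), phase1_out = runif(5),
                  phase1.1 = runif(5), phase1.away.1 = runif(5),
                  phase1_in.1 = runif(5), phase1_out.1 = runif(5))

id <- suppressWarnings(as.numeric(gsub(".*\\.(\\d+$)", "\\1", names(vec))))
id[1] <- -1
id[is.na(id)] <- 0

i <- 1  # one pass of the loop: the "phase1.1" group

# Columns of group i plus Names, without the "away" columns
keep <- id %in% c(i, -1) & !grepl("away", names(vec))
out <- vec[keep]
names(out)  # "Names" "phase1.1" "phase1_in.1" "phase1_out.1"

# Locate the ordering column: "phase1" with an optional ".<digits>" suffix
ovec <- grepl("phase1(\\.\\d+)?$", names(out))
names(out)[ovec]  # "phase1.1"
```

Inspecting names(out) and names(out)[ovec] for one value of i is a cheap sanity check that the two regular expressions pick the columns you intended.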

Btw, ordering sub-dataframes is quite a lot faster than ordering the original dataframe first and then selecting your variables. This is the gain you have in the solution of nullglob.
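If you want to check the timing claim on your own machine, something like the rough sketch below should do. The row count n, the seed, and the two helper functions are hypothetical choices for illustration only; timings depend on the number of columns and rows, so treat the output as indicative rather than as a benchmark.

```r
set.seed(42)  # arbitrary seed for reproducibility
n <- 1e5      # hypothetical row count, chosen only for illustration

big <- data.frame(Names = paste0("var", seq_len(n)),
                  phase1 = runif(n), phase1.away = runif(n),
                  phase1_in = runif(n), phase1_out = runif(n))
cols <- c("Names", "phase1", "phase1_in", "phase1_out")

# (a) order the full data frame first, then select the columns
order_then_select <- function(d) d[order(-d$phase1), ][cols]

# (b) select the columns first, then order the smaller data frame
select_then_order <- function(d) {
  s <- d[cols]
  s[order(-s$phase1), ]
}

print(system.time(a <- order_then_select(big)))
print(system.time(b <- select_then_order(big)))

# Both routes should give the same result
stopifnot(isTRUE(all.equal(a, b)))
```

The more columns the original data frame has beyond the ones you keep, the more work route (a) spends reordering data that is thrown away afterwards.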
