I have a dataframe such as: lat lon var01 var02 var03 var04 var11 var12

Question

Asked: May 30, 2026 (2026-05-30T16:21:41+00:00)

I have a dataframe such as:

lat lon var01 var02 var03 var04 var11 var12 var13 var14 ...

and another one like:

lat lon var05 var15 var25 ...

The required output is:

lat lon var01 var02 var03 var04 var05 var11 var12 var13 var14 var15 ...

I thought this would be easy in R, but I haven’t found a way so far. I’m also open to solutions in other languages, such as bash. Ideally the solution would be only a few lines of code; I already know how to do it with loops and such.

Thanks in advance

Edit: The following solution requires that the columns are named correctly. Imagine the following situation:

d1 <- data.frame(lat = 1:10, lon = 1:10, V11 = runif(10), V12 = rnorm(10), V21 = runif(10), V22 = rnorm(10)) 
d2 <- data.frame(lat = 1:10, lon = 1:10, A13 = runif(10), A23 = rnorm(10)) 
res <- merge(d1, d2, sort = FALSE) 
res <- res[, c(1:2, order(colnames(res[, -(1:2)])) + 2)]

The output is

lat lon        A13        A23        V11        V12        V21        V22
 10  10 0.21269952  0.2670988 0.87532133 -0.6887557 0.60493329 -0.1350546
  1   1 0.61464497 -0.5686687 0.91287592 -0.4149946 0.23962942  0.3981059
  2   2 0.55715954 -0.1351786 0.29360337 -0.3942900 0.05893438 -0.6120264
  3   3 0.32877732  1.1780870 0.45906573 -0.0593134 0.64228826  0.3411197
  4   4 0.45313145 -1.5235668 0.33239467  1.1000254 0.87626921 -1.1293631
  5   5 0.50044097  0.5939462 0.65087047  0.7631757 0.77891468  1.4330237
  6   6 0.18086636  0.3329504 0.25801678 -0.1645236 0.79730883  1.9803999  
  7   7 0.52963060  1.0630998 0.47854525 -0.2533617 0.45527445 -0.3672215
  8   8 0.07527575 -0.3041839 0.76631067  0.6969634 0.41008408 -1.0441346
  9   9 0.27775593  0.3700188 0.08424691  0.5566632 0.81087024  0.5697196

and the required output is:

lat lon V11 V12 A13 V21 V22 A23

1 Answer

Editorial Team · Answer 1 · 2026-05-30T16:21:42+00:00

merge() is a suitable tool for this job. Here is an example:

set.seed(1)
d1 <- data.frame(lat = 1:10, lon = 1:10, V2 = runif(10), V4 = rnorm(10))
d2 <- data.frame(lat = 1:10, lon = 1:10, V1 = runif(10), V3 = rnorm(10))

## merge the data using `lat` and `lon`
res <- merge(d1, d2, sort = FALSE) ## `sort = FALSE`: don't sort rows on lat/lon (row order is then unspecified)

## get columns in right order
res <- res[, c(1:2, order(colnames(res[, -(1:2)])) + 2)]

Which gives:

> res
   lat lon        V1         V2          V3         V4
1    1   1 0.4820801 0.26550866  0.91897737 -0.8204684
2    2   2 0.5995658 0.37212390  0.78213630  0.4874291
3    3   3 0.4935413 0.57285336  0.07456498  0.7383247
4    4   4 0.1862176 0.90820779 -1.98935170  0.5757814
5    5   5 0.8273733 0.20168193  0.61982575 -0.3053884
6    6   6 0.6684667 0.89838968 -0.05612874  1.5117812
7    7   7 0.7942399 0.94467527 -0.15579551  0.3898432
8    8   8 0.1079436 0.66079779 -1.47075238 -0.6212406
9    9   9 0.7237109 0.62911404 -0.47815006 -2.2146999
10  10  10 0.4112744 0.06178627  0.41794156  1.1249309
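
Sorting on the full column name only works when all names share a prefix. With names like V11, V12, A13 from the question's edit, every A column would land before the V columns. A minimal sketch for that case, assuming the digits in each name alone define the intended order (the helper `order_by_suffix` and its `nkeys` argument are hypothetical, not part of the original answer):

```r
## Hypothetical helper: order the non-key columns by the digits in their names.
## Assumes the first `nkeys` columns are the keys (lat, lon) and that every
## remaining column name contains a number, e.g. "V11" or "A13".
order_by_suffix <- function(df, nkeys = 2) {
    nms <- names(df)[-seq_len(nkeys)]
    suffix <- as.numeric(gsub("[^0-9]", "", nms))  # drop non-digits, keep the number
    df[, c(seq_len(nkeys), order(suffix) + nkeys), drop = FALSE]
}

d1 <- data.frame(lat = 1:10, lon = 1:10, V11 = runif(10), V12 = rnorm(10),
                 V21 = runif(10), V22 = rnorm(10))
d2 <- data.frame(lat = 1:10, lon = 1:10, A13 = runif(10), A23 = rnorm(10))
res <- order_by_suffix(merge(d1, d2, sort = FALSE))
names(res)
## [1] "lat" "lon" "V11" "V12" "A13" "V21" "V22" "A23"
```

This would break if two columns carried the same number or if a name had no digits at all, so it is only a starting point.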

Update based on revised Q:

## dummy data
set.seed(1)
df3 <- data.frame(matrix(runif(60), ncol = 6))
names(df3) <- paste("df3Var", 1:6, sep = "")
df3 <- cbind.data.frame(lat = 1:10, lon = 1:10, df3)
df4 <- data.frame(matrix(runif(30), ncol = 3))
names(df4) <- paste("df4Var", 1:3, sep = "")
df4 <- cbind.data.frame(lat = 1:10, lon = 1:10, df4)

## merge
res2 <- merge(df3, df4, sort = FALSE)

This gives:

> head(res2)
  lat lon   df3Var1   df3Var2   df3Var3   df3Var4   df3Var5    df3Var6
1   1   1 0.2655087 0.2059746 0.9347052 0.4820801 0.8209463 0.47761962
2   2   2 0.3721239 0.1765568 0.2121425 0.5995658 0.6470602 0.86120948
3   3   3 0.5728534 0.6870228 0.6516738 0.4935413 0.7829328 0.43809711
4   4   4 0.9082078 0.3841037 0.1255551 0.1862176 0.5530363 0.24479728
5   5   5 0.2016819 0.7698414 0.2672207 0.8273733 0.5297196 0.07067905
6   6   6 0.8983897 0.4976992 0.3861141 0.6684667 0.7893562 0.09946616
    df4Var1   df4Var2   df4Var3
1 0.9128759 0.3390729 0.4346595
2 0.2936034 0.8394404 0.7125147
3 0.4590657 0.3466835 0.3999944
4 0.3323947 0.3337749 0.3253522
5 0.6508705 0.4763512 0.7570871
6 0.2580168 0.8921983 0.2026923
> names(res2)
 [1] "lat"     "lon"     "df3Var1" "df3Var2" "df3Var3" "df3Var4" "df3Var5"
 [8] "df3Var6" "df4Var1" "df4Var2" "df4Var3"

OK, so now note the ordering. Assume we want to take the variables from df3 in groups of 2, each group followed by 1 variable from df4, and that within each of df3 and df4 the variables are already in the correct order. For this we need to create an index vector ord that is:

> ord
[1] 1 2 7 3 4 8 5 6 9

to which we then add 2 (to skip over the lat and lon columns in the merged data frame):

> ord + 2
[1]  3  4  9  5  6 10  7  8 11
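
One way to see where this vector comes from is to build it by interleaving. This is a sketch for illustration only, not the construction used in the rest of this answer: lay out the indices of the first data frame's variables in a matrix with one column per group, append the second data frame's indices as an extra row, and read the result off column by column. The names `first` and `second` are made up for the example.

```r
## Positions are counted after the two key columns of the merged data frame:
## 1:6 are the df3 variables, 7:9 the df4 variables.
ningrps <- 2   # df3 variables per group
ngrps   <- 3   # number of groups, each followed by one df4 variable
first  <- matrix(seq_len(ngrps * ningrps), nrow = ningrps)  # one column per group
second <- ngrps * ningrps + seq_len(ngrps)                   # 7 8 9
ord <- as.vector(rbind(first, second))                       # read column-wise
ord
## [1] 1 2 7 3 4 8 5 6 9
```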

Once you know the target sequence, all that is needed is R’s vectorised tools and a little arithmetic to generate it. I build the index up in two stages: i) first I work out where columns (1:6) + 2 of the merged data frame should sit in ord, and then ii) I fill the remaining slots with the merged data frame’s indexes for the columns that came from the second data frame.

ord <- numeric(length = sum(ncol(df3), ncol(df4)) - 4)
ngrps <- 3    ## number of groups
ningrps <- 2  ## df3 variables per group
## i) each group spans ningrps + 1 slots (ningrps from df3, then 1 from df4)
want <- rep(seq_len(ningrps), ngrps) + 
    rep(seq(from = 0, by = ningrps + 1, length.out = ngrps), 
        each = ningrps)
ord[want] <- seq_len(prod(ngrps, ningrps))
## ii) the last slot of each group takes a df4 column
want <- (ningrps + 1) * seq_len(ngrps)
ord[want] <- seq(to = sum(ncol(df3), ncol(df4)) - 4, by = 1, length.out = ngrps)
res3 <- res2[, c(1:2, ord+2)]

That gives:

> head(res3)
  lat lon   df3Var1   df3Var2   df4Var1   df3Var3   df3Var4   df4Var2   df3Var5
1   1   1 0.2655087 0.2059746 0.9128759 0.9347052 0.4820801 0.3390729 0.8209463
2   2   2 0.3721239 0.1765568 0.2936034 0.2121425 0.5995658 0.8394404 0.6470602
3   3   3 0.5728534 0.6870228 0.4590657 0.6516738 0.4935413 0.3466835 0.7829328
4   4   4 0.9082078 0.3841037 0.3323947 0.1255551 0.1862176 0.3337749 0.5530363
5   5   5 0.2016819 0.7698414 0.6508705 0.2672207 0.8273733 0.4763512 0.5297196
6   6   6 0.8983897 0.4976992 0.2580168 0.3861141 0.6684667 0.8921983 0.7893562
     df3Var6   df4Var3
1 0.47761962 0.4346595
2 0.86120948 0.7125147
3 0.43809711 0.3999944
4 0.24479728 0.3253522
5 0.07067905 0.7570871
6 0.09946616 0.2026923

which is the ordering you wanted. Now we can cook that into a little function:

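myMerge <- function(x, y, ngrps, ningrps, ...) {
    out <- merge(x, y, ...)
    ncols <- ncol(out) - 2   ## variables, excluding lat and lon
    stride <- ningrps + 1    ## each group: ningrps columns from x, then 1 from y
    ord <- numeric(length = ncols)
    ## i) slots for the columns that came from x
    want <- rep(seq_len(ningrps), ngrps) + 
        rep(seq(from = 0, by = stride, length.out = ngrps), each = ningrps)
    ord[want] <- seq_len(prod(ngrps, ningrps))
    ## ii) the last slot of each group takes a column from y
    want <- stride * seq_len(ngrps)
    ord[want] <- seq(to = ncols, by = 1, length.out = ngrps)
    out <- out[, c(1:2, ord+2)]
    out
}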

Which when used on df3 and df4 above gives:

> myMerge(df3, df4, ngrps = 3, ningrps = 2, sort = FALSE)
   lat lon    df3Var1   df3Var2    df4Var1    df3Var3   df3Var4   df4Var2
1    1   1 0.26550866 0.2059746 0.91287592 0.93470523 0.4820801 0.3390729
2    2   2 0.37212390 0.1765568 0.29360337 0.21214252 0.5995658 0.8394404
3    3   3 0.57285336 0.6870228 0.45906573 0.65167377 0.4935413 0.3466835
4    4   4 0.90820779 0.3841037 0.33239467 0.12555510 0.1862176 0.3337749
5    5   5 0.20168193 0.7698414 0.65087047 0.26722067 0.8273733 0.4763512
6    6   6 0.89838968 0.4976992 0.25801678 0.38611409 0.6684667 0.8921983
7    7   7 0.94467527 0.7176185 0.47854525 0.01339033 0.7942399 0.8643395
8    8   8 0.66079779 0.9919061 0.76631067 0.38238796 0.1079436 0.3899895
9    9   9 0.62911404 0.3800352 0.08424691 0.86969085 0.7237109 0.7773207
10  10  10 0.06178627 0.7774452 0.87532133 0.34034900 0.4112744 0.9606180
     df3Var5    df3Var6   df4Var3
1  0.8209463 0.47761962 0.4346595
2  0.6470602 0.86120948 0.7125147
3  0.7829328 0.43809711 0.3999944
4  0.5530363 0.24479728 0.3253522
5  0.5297196 0.07067905 0.7570871
6  0.7893562 0.09946616 0.2026923
7  0.0233312 0.31627171 0.7111212
8  0.4772301 0.51863426 0.1216919
9  0.7323137 0.66200508 0.2454885
10 0.6927316 0.40683019 0.1433044

Which is again what you wanted. You could adjust the function definition so that you don’t need to specify both ngrps and ningrps, since either can be worked out from the other plus the number of variable columns in df3 (that is, ncol(df3) - 2). But I’ll leave that as an exercise for the reader.
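
A possible take on that exercise, as a sketch only: the hypothetical `myMerge2` below asks for `ningrps` and derives the number of groups from the column count of `x`. It assumes the key columns (`lat`, `lon`) come first in both inputs, that `y` contributes exactly one variable per group, and that the variables of `x` divide evenly into groups. The `nkeys` argument and the variable names inside the function are made up for the example.

```r
## Sketch: hypothetical variant of myMerge() that infers the number of groups.
myMerge2 <- function(x, y, ningrps, nkeys = 2, ...) {
    nx <- ncol(x) - nkeys            # variables contributed by x
    ny <- ncol(y) - nkeys            # variables contributed by y
    ngrps <- nx / ningrps            # number of groups implied by x
    stopifnot(ngrps == round(ngrps), ny == ngrps)
    out <- merge(x, y, ...)
    stride <- ningrps + 1            # slots per group: ningrps from x, 1 from y
    yslots <- stride * seq_len(ngrps)
    ord <- numeric(nx + ny)
    ord[-yslots] <- seq_len(nx)      # x columns fill every slot not reserved for y
    ord[yslots] <- nx + seq_len(ny)  # y columns close each group
    out[, c(seq_len(nkeys), ord + nkeys)]
}

## same dummy data as above
set.seed(1)
df3 <- data.frame(matrix(runif(60), ncol = 6))
names(df3) <- paste("df3Var", 1:6, sep = "")
df3 <- cbind.data.frame(lat = 1:10, lon = 1:10, df3)
df4 <- data.frame(matrix(runif(30), ncol = 3))
names(df4) <- paste("df4Var", 1:3, sep = "")
df4 <- cbind.data.frame(lat = 1:10, lon = 1:10, df4)

names(myMerge2(df3, df4, ningrps = 2, sort = FALSE))
```

The `stopifnot()` call is there so that a mismatch between the two data frames fails early instead of silently producing a scrambled column order.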
