I’ve got two long lists A and B which have the same length but contain different numbers of equivalent elements:
List A can contain many elements which also can recur in the same field.
List B either contains only one element or an empty field, i.e. “character(0)”.
A also contains some empty fields, but for these records there’s always an element present in B, so no record is empty in both A and B.
I want to combine the elements of A and B into a new list of the same length, C, according to the following rules:
- All elements from A have to be present in C – including their potential recurrences in the same field.
- If B contains an element which isn’t already present in A of the same record it’ll be added to C as well.
- But if B contains an element which already is present in A of the same record it’ll be ignored.
- If A has an empty field the element from B for this record will be added to C.
- If B has an empty field the element(s) from A for this record will be added to C.
This is an example of how these lists begin:
> A
[1] "JAMES" "JAMES"
[2] "JOHN" "ROBERT"
[3] "WILLIAM" "MICHAEL" "WILLIAM" "DAVID" "WILLIAM"
[4] character(0)
...
> B
[1] "RICHARD"
[2] "JOHN"
[3] character(0)
[4] "CHARLES"
...
This is the correct output I’m looking for:
> C
[1] "JAMES" "JAMES" "RICHARD"
[2] "JOHN" "ROBERT"
[3] "WILLIAM" "MICHAEL" "WILLIAM" "DAVID" "WILLIAM"
[4] "CHARLES"
...
I tried, e.g.:
C <- sapply(mapply(union, A,B), setdiff, character(0))
But this deleted the recurrences from A, unfortunately:
> C
[1] "JAMES" "RICHARD"
[2] "JOHN" "ROBERT"
[3] "WILLIAM" "MICHAEL" "DAVID"
[4] "CHARLES"
...
Can anybody tell me, please, how to combine these two lists, preserve the recurrences from A, and achieve the output I desire?
Thank you very much in advance!
Update: Machine readable data:
A <- list(c("JAMES","JAMES"),
c("JOHN","ROBERT"),
c("WILLIAM","MICHAEL","WILLIAM","DAVID","WILLIAM"),
character(0))
B <- list("RICHARD","JOHN",character(0),"CHARLES")
Here is your snippet of data, in reproducible form:
You were close with mapply(). I got the desired output by using c() to concatenate the list elements in A and B, but had to manipulate elements of the supplied vectors first, so I came up with a small helper function, foo(), to pass to mapply().

Inside foo() we can refer to the individual elements of ... using the ..n placeholders; ..1 is A and ..2 is B. Of course, foo() only works with two lists but doesn’t enforce this or do any checking, just to keep things simple. foo() also needs to handle the cases where either A or B or both are character(0), which I now think foo() does. When we use that in the mapply() call I get the output you asked for.

An lapply() version may be more meaningful than the abstract ..n but uses essentially the same code: a new function that works with A and B directly, while we iterate over the indices of the elements of A (1, 2, 3, length(A)) as generated by seq_along().