A New Year’s quandary for the Stack Overflow community, whose posts and answers have been a great help in the past (this is my first question). I’ve found a workaround, but I’m wondering whether other approaches or solutions might be suggested.
I am attempting to remove trailing NAs from a large data.frame, but those NAs are found in only a few of its columns, and I would like to retain all columns in the output. Here is a representative subset of the data.
df=data.frame(var1=rep("A", 8), var2=c("a","b","c","d","e","f","g","h"), var3=c(0,1,NA,2,3,NA,NA,NA), var4=c(0,0,NA,4,5,NA,NA,NA), var5=c(0,0,NA,0,2,4,NA,NA))
Goals of the process:
- Trim trailing NAs based on NA presence in var3, var4 and var5
- Retain all columns in final output
- Only remove trailing NAs (i.e. row 3 remains in record as a placeholder)
- Only trim a row if var3, var4 and var5 are all NA in that row (i.e. rows 7 and 8, but not row 6)
Based on these goals, the solution should remove the last two rows of df:
df.output = df[-c(7,8),]
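To make the target behaviour concrete, here is a minimal base R sketch of the trimming logic described in the goals above. The helper name trim_trailing_na and its cols argument are made up for illustration and are not part of zoo or any other package.

```r
# Sample data from the question
df <- data.frame(
  var1 = rep("A", 8),
  var2 = c("a", "b", "c", "d", "e", "f", "g", "h"),
  var3 = c(0, 1, NA, 2, 3, NA, NA, NA),
  var4 = c(0, 0, NA, 4, 5, NA, NA, NA),
  var5 = c(0, 0, NA, 0, 2, 4, NA, NA)
)

# Hypothetical helper (not from zoo): drop trailing rows in which every
# column listed in `cols` is NA. All-NA rows in the interior are kept.
trim_trailing_na <- function(x, cols) {
  # TRUE for rows with at least one non-NA value in the selected columns
  has_value <- rowSums(!is.na(x[, cols, drop = FALSE])) > 0
  # Counting from the bottom, the cumulative sum stays at 0 until the
  # last row that holds a value, so only the trailing all-NA rows are FALSE
  keep <- rev(cumsum(rev(has_value)) > 0)
  x[keep, , drop = FALSE]
}

df.trimmed <- trim_trailing_na(df, c("var3", "var4", "var5"))
```

With the sample data, row 3 (an interior all-NA row) and row 6 (where var5 still holds a value) are kept, and only rows 7 and 8 are dropped.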
The behaviour of na.trim (in the zoo package) is ideal, since with sides="right" it limits removal to the NAs at the end of the data.frame. My workaround involved altering the na.trim.default function to include a column subset term.
Any suggestions? Many thanks for any help.
EDIT: Just to complete this question, below is the function I created from the na.trim.default code. It also works but, as noted, does require loading the zoo package.
na.trim.multiplecols <- function (object, colrange, sides = c("both", "left", "right"), is.na = c("any", "all"), ...)
{
    is.na <- match.arg(is.na)
    # nisna is TRUE for rows that count as "not NA" within colrange:
    # "any" treats a row as NA if any selected column is NA,
    # "all" only if every selected column is NA
    nisna <- if (is.na == "any" || length(dim(object[, colrange])) < 1) {
        complete.cases(object[, colrange])
    }
    else rowSums(!is.na(object[, colrange])) > 0
    # idx marks the rows to keep after trimming the requested side(s)
    idx <- switch(match.arg(sides),
        left = cumsum(nisna) > 0,
        right = rev(cumsum(rev(nisna) > 0) > 0),
        both = (cumsum(nisna) > 0) & rev(cumsum(rev(nisna)) > 0))
    if (length(dim(object)) < 2)
        object[idx]
    else object[idx, , drop = FALSE]
}
Edit: First solution using base rle and apply. Second solution using Rle from package IRanges:
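The code for these solutions is not reproduced here. As a rough illustration only, and not necessarily what the answer used, base rle and apply could be combined along the following lines. The names cols, all_na and n_trailing are chosen for this sketch, and it assumes the data.frame has at least one row.

```r
# Sample data from the question
df <- data.frame(
  var1 = rep("A", 8),
  var2 = c("a", "b", "c", "d", "e", "f", "g", "h"),
  var3 = c(0, 1, NA, 2, 3, NA, NA, NA),
  var4 = c(0, 0, NA, 4, 5, NA, NA, NA),
  var5 = c(0, 0, NA, 0, 2, 4, NA, NA)
)

cols <- c("var3", "var4", "var5")

# TRUE for rows where every selected column is NA
all_na <- unname(apply(is.na(df[, cols, drop = FALSE]), 1, all))

# Run-length encode the flags; only the final run can be "trailing"
runs <- rle(all_na)
n_runs <- length(runs$values)

# Length of the final run if it is an all-NA run, otherwise 0
# (assumes nrow(df) >= 1, so that at least one run exists)
n_trailing <- if (runs$values[n_runs]) runs$lengths[n_runs] else 0

# Keep everything except the trailing all-NA rows
df.trimmed <- head(df, nrow(df) - n_trailing)
```

Because only the last run is inspected, interior all-NA rows such as row 3 are left in place.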