Sometimes I want to apply a function (e.g. a difference calculation) to a dataset and store the results directly in the data frame:
df <- data.frame(a$C, diff(a$C))
But I cannot do that because the number of rows is different.
Is there some syntax that will allow me to do that, perhaps inserting `NA` where the function (`diff()`) gives no result?
There isn’t a general solution to this without making vast assumptions about the whole panoply of functions one may wish to use.
For the example you show, we can easily work out that the first value from `diff()` would be an `NA` if it returned one, because the first element has no predecessor to subtract. So if you are using `diff()`, you can always just prepend an `NA` to its result, e.g. `c(NA, diff(a$C))`.

But now consider what you would do with any other function that you might wish to use that doesn’t always return `NA` in the correct place.

For this example, we can use the zoo package, whose `diff()` method has an `na.pad` argument, e.g. `diff(zoo(a$C), na.pad = TRUE)`.

If you are using a modelling function with a formula interface (e.g. `lm()`) and that function has an `na.action` argument, then you can set `na.action = na.exclude` in the function call, and extractor functions such as `fitted()`, `resid()` etc. will add `NA` back into their output in the correct places, so that the output is the same length as the data passed to the modelling function.

If you have other more specific cases you want to explore, please edit your question. In specific cases there will usually be a simple answer. In the general case the answer is no, it is not possible to do what you ask.
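As a minimal sketch of the `diff()` case, here is the padding done by hand in base R. The data frame `a` below is made-up example data standing in for yours, and `pad_diff()` is a hypothetical helper, not an existing function; it only makes sense when `lag` is smaller than the length of the input.

```r
# Made-up example data standing in for the asker's `a`
a <- data.frame(C = c(3, 7, 12, 20, 31))

# diff() returns one fewer element than its input,
# so pad the front with NA to match the number of rows
df <- data.frame(C = a$C, dC = c(NA, diff(a$C)))

# Hypothetical helper: pad with as many NAs as the lag removes
# (assumes lag < length(x))
pad_diff <- function(x, lag = 1) {
  c(rep(NA, lag), diff(x, lag = lag))
}

df$dC2 <- pad_diff(a$C, lag = 2)
print(df)
```

The padding position is specific to `diff()`: it works because we know the missing values belong at the start, which is exactly the kind of assumption that does not carry over to arbitrary functions.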
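The modelling case can be sketched in a similar way. The data frame `d` is made up for illustration, with one missing response value; the point is only to compare the length of the extractor output under `na.omit` and `na.exclude`.

```r
# Made-up data with a missing response in row 3
d <- data.frame(x = 1:6, y = c(2.1, 3.9, NA, 8.2, 9.8, 12.1))

# With na.omit the incomplete row is dropped,
# so the residuals are shorter than the data
fit_omit <- lm(y ~ x, data = d, na.action = na.omit)
n_omit <- length(resid(fit_omit))

# With na.exclude the row is still dropped from the fit,
# but fitted() and resid() pad their output with NA
fit_excl <- lm(y ~ x, data = d, na.action = na.exclude)
d$fitted <- fitted(fit_excl)
d$resid  <- resid(fit_excl)

print(n_omit)
print(d)
```

Because the padded vectors have one element per row of `d`, they can be assigned straight back into the data frame without any manual alignment.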