I have two related questions — I’m trying to learn R properly, so I’m doing some homework problems from an R course. They have us writing a function to return a vector of correlations:
example.function <- function(threshold = 0) {
  example.vector <- vector()
  example.vector <- sapply(1:30, function(i) {
    complete.record.count <- # ... counts the complete records in each of the 30 files.
    ## Cutting for space and to avoid giving away answers.
    ## a few lines get the complete records in each
    ## file and count them.
    if(complete.record.count > threshold) {
      new.correlation <- cor(complete.record$val1, complete.record$val2)
      print(new.correlation)
      example.vector <- c(new.correlation, example.vector)
    }
  })
  # more null value handling#
  return(example.vector)
}
As the function runs, it prints each correlation value to stdout. The printed values are accurate to six decimal places, so I know I’m getting a good value for new.correlation. The vector that is returned doesn’t include those values, though. Instead, it contains whole numbers in sequence.
> tmp <- example.function()
> head(tmp)
[1] 2 3 4 5 6 7
I can’t figure out why sapply is pushing integers into the vector. What am I missing here?
I actually don’t understand the core structure, which is more or less:
some.vector <- vector()
some.vector <- sapply(range, function(i) {
  some.vector <- c(new.value, some.vector)
})
That seems awfully un-R-like in its redundancy. Tips?
If you use sapply you don’t need to create the vector yourself and you don’t need to grow it (sapply takes care of all that). You probably want something like this:
However, it is unclear how the index i factors into the anonymous function and the question is not reproducible …
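A minimal sketch of that idea, not the original code from this answer. Since the question is not reproducible, the `records` list below is a hypothetical stand-in for the 30 files, and `example.function` takes it as an extra argument. Only the column names `val1` and `val2` and the `threshold` parameter come from the question.

```r
# Hypothetical stand-in for the question's files: a list of small data
# frames with the columns val1 and val2 (assumed data, for illustration).
set.seed(1)
records <- lapply(1:5, function(i) {
  n <- 4 * i                      # files differ in size: 4, 8, 12, 16, 20 rows
  d <- data.frame(val1 = rnorm(n), val2 = rnorm(n))
  d$val1[1] <- NA                 # one incomplete row per file
  d
})

example.function <- function(records, threshold = 0) {
  # sapply collects whatever each call returns: a correlation, or NULL
  # when the file has too few complete records. No vector is created
  # or grown by hand inside the anonymous function.
  result <- sapply(records, function(d) {
    complete.record <- d[complete.cases(d), ]
    if (nrow(complete.record) > threshold) {
      cor(complete.record$val1, complete.record$val2)
    }
  })
  # When some entries are NULL, sapply cannot simplify and returns a
  # list; unlist drops the NULLs and yields a plain numeric vector.
  unlist(result)
}

# Only the files with more than 10 complete records contribute.
example.function(records, threshold = 10)
```

The key point is that the value of the anonymous function is what ends up in the result. An assignment such as `example.vector <- c(new.correlation, example.vector)` inside that function only creates a local variable for that one call, so it never modifies the vector defined outside.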