I have a fairly large data frame, about 10 million rows. It has columns x and y, and what I want is to compute
hypot <- function(x) {sqrt(x[1]^2 + x[2]^2)}
for each row. Using apply it would take a lot of time (about 5 minutes, extrapolating from smaller sizes) and memory.
That is too much for me, so I’ve tried different things:
- compiling the hypot function reduces the time by about 10%
- using functions from plyr greatly increases the running time.
What’s the fastest way to do this thing?
What about with(my_data, sqrt(x^2 + y^2))?

Two different per-line functions, one taking advantage of vectorization:
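Something along these lines, as a rough sketch: hypot is the function from the question, while the body of hypot2 (sqrt(sum(x^2))) is my assumption of what a per-row function that uses vectorization would look like. The small my_data frame is made up just so the snippet runs.

```r
# Small stand-in data frame; the size is illustrative only.
set.seed(1)
my_data <- data.frame(x = runif(1000), y = runif(1000))

# Per-row function from the question: indexes the two elements explicitly.
hypot <- function(x) sqrt(x[1]^2 + x[2]^2)

# Assumed form of hypot2: squares the whole row vector at once, then sums.
hypot2 <- function(x) sqrt(sum(x^2))

# Row-by-row application; apply() calls the function once per row.
r1 <- apply(my_data, 1, hypot)
r2 <- apply(my_data, 1, hypot2)

# Vectorized over whole columns, with no per-row function call.
r3 <- with(my_data, sqrt(x^2 + y^2))
r4 <- sqrt(my_data[, "x"]^2 + my_data[, "y"]^2)
```

All four should give the same numbers; the difference is only in how many R-level function calls are needed to get there.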
Try compiling these too:
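A minimal sketch of the compilation step and a crude timing comparison, assuming the compiler package's cmpfun() and the same assumed hypot2 body (sqrt(sum(x^2))). The names chypot and chypot2 and the 10,000-row test frame are just for illustration. On recent R versions the JIT byte-compiler is enabled by default, so the explicit compilation may make even less difference there.

```r
library(compiler)  # part of the standard R distribution

hypot  <- function(x) sqrt(x[1]^2 + x[2]^2)
hypot2 <- function(x) sqrt(sum(x^2))  # assumed form, see above

# Byte-compiled versions of the same functions.
chypot  <- cmpfun(hypot)
chypot2 <- cmpfun(hypot2)

# Illustrative data; far smaller than the 10 million rows in the question.
set.seed(1)
my_data <- data.frame(x = runif(1e4), y = runif(1e4))

# Elapsed seconds for each variant; the numbers will vary by machine.
timings <- c(
  hypot      = system.time(apply(my_data, 1, hypot))[["elapsed"]],
  hypot2     = system.time(apply(my_data, 1, hypot2))[["elapsed"]],
  chypot     = system.time(apply(my_data, 1, chypot))[["elapsed"]],
  chypot2    = system.time(apply(my_data, 1, chypot2))[["elapsed"]],
  vectorized = system.time(with(my_data, sqrt(x^2 + y^2)))[["elapsed"]]
)
print(timings)
```

For more reliable numbers you would want to repeat each call several times rather than rely on a single system.time() run.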
Results:
As expected, the with() solution and the column-indexing solution à la Tyler Rinker are essentially identical; hypot2 is twice as fast as the original hypot (but still about 150 times slower than the vectorized solutions). As already pointed out by the OP, compilation doesn’t help very much.