I need to work with some data sets read with read.table from CSV (comma-separated values) files, and I would like to know how to compute the size of the memory allocated for each type of variable.
How can I do it?
Edit: in other words, how much memory does R allocate for a general data frame read from a .csv file?
You can get the amount of memory allocated to an object with object.size. This script might also be helpful: it lets you view or graph the amount of memory used by all of your current objects.
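As a minimal sketch of the object.size approach applied to the question's case, the following writes a small made-up table to a temporary CSV file (the column names and values are purely illustrative), reads it back with read.table, and reports the size of the whole data frame and of each column. Exact byte counts depend on the R version and platform, so none are quoted here.

```r
# Build a small illustrative CSV in a temporary file (hypothetical data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,height,group",
             "1,1.70,a",
             "2,1.82,b",
             "3,1.65,a"), csv_path)

# Read it the way the question describes; strings become factors here.
df <- read.table(csv_path, header = TRUE, sep = ",", stringsAsFactors = TRUE)

# Size of the whole data frame, in bytes.
total_size <- object.size(df)
print(total_size)

# Size of each column, in bytes, alongside its class.
col_sizes <- sapply(df, function(col) as.numeric(object.size(col)))
col_types <- sapply(df, class)
print(data.frame(type = col_types, bytes = col_sizes))

# Clean up the temporary file.
unlink(csv_path)
```

The per-column sizes will not add up exactly to the total, because the data frame itself carries some overhead (names, row names, class attribute).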
In answer to your question of why object.size(4) is 48 bytes, the reason is that there is some overhead in each numeric vector. (In R, the number 4 is not just an integer as in other languages; it is a numeric vector of length 1.) But that doesn't hurt performance, because the overhead does not increase with the size of the vector. If you measure a long integer vector and divide its size by its length, you see that each integer itself requires only 4 bytes (as you would expect).
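One way to check this yourself is to compare a length-1 vector with a long one and divide the size by the length. This sketch uses integer(n) and numeric(n) to allocate ordinary vectors of a chosen length; the length n is an arbitrary choice, and the fixed overhead you see may differ from the 48 bytes quoted above depending on your R version and platform.

```r
n <- 100000L

# A length-1 numeric vector: almost entirely fixed overhead.
print(object.size(4))

# A long integer vector: the overhead is spread over n elements.
int_vec <- integer(n)
bytes_per_int <- as.numeric(object.size(int_vec)) / n
print(bytes_per_int)

# The same length stored as doubles, for comparison.
dbl_vec <- numeric(n)
bytes_per_dbl <- as.numeric(object.size(dbl_vec)) / n
print(bytes_per_dbl)
```

The per-element figure for the integer vector should come out very close to 4, and for the double vector very close to 8.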
Thus, in summary:

For a numeric vector of length n stored as integers, the size in bytes is typically 40 + 8 * floor(n / 2), that is, about 4 bytes per element (a vector of doubles needs 8 bytes per element instead). However, on my version of R and OS there is a single slight discontinuity, where it jumps to 168 bytes faster than you would expect (see plot below). Beyond that, the linear relationship holds, even up to a vector of length 10000000.

For a categorical variable, you can see a very similar linear trend, though with a bit more overhead (see below). Outside of a few slight discontinuities, the relationship is quite close to 400 + 60 * n.
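If you want to reproduce these trends on your own machine, a rough sketch is to measure object.size over a range of lengths and fit a straight line. The lengths below are arbitrary, and the factor case assumes one distinct level per element, which is only a guess at how the categorical variable above was built; your intercepts and slopes may differ from the figures quoted.

```r
# Lengths to measure (illustrative choice).
lengths_to_try <- c(100, 1000, 10000, 100000)

# Size in bytes of an integer vector of each length.
int_bytes <- sapply(lengths_to_try, function(n) {
  as.numeric(object.size(integer(n)))
})

# Size in bytes of a factor of each length.
# Assumption: every element has its own level (the most expensive case).
factor_bytes <- sapply(lengths_to_try, function(n) {
  as.numeric(object.size(factor(seq_len(n))))
})

# Fit bytes = intercept + slope * n for each type.
int_fit <- coef(lm(int_bytes ~ lengths_to_try))
factor_fit <- coef(lm(factor_bytes ~ lengths_to_try))

print(rbind(integer = int_fit, factor = factor_fit))
```

A factor with only a few levels repeated many times costs much less per element, since each distinct level string is stored once and the elements themselves are integer codes.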