I have some calculation going on and get the following warning (i.e. not an error):
Warning messages:
1: In sum(myvar, na.rm = T) :
Integer overflow - use sum(as.numeric(.))
In this thread people state that integer overflows simply don’t happen. Either R isn’t overly modern or they are not right. However, what am I supposed to do here? If I use as.numeric as the warning suggests, I might not account for the fact that information was lost well before that point. myvar is read from a .csv file, so shouldn’t R figure out that a bigger field is needed? Does it already cut something off?
What’s the max length of integer or numeric? Would you suggest any other field type / mode?
EDIT: I run:
R version 2.13.2 (2011-09-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) within R Studio
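You can answer many of your questions by reading the help page ?integer. It says that R uses 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9. Expanding to larger integers is under consideration by R Core but it’s not going to happen in the near future.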
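To see the limit that ?integer describes, here is a minimal sketch. The vector myvar below is made-up data standing in for the column from the question, and the variable names are only for illustration.

```r
# Largest value an R integer can hold (2^31 - 1 on current builds).
int_max <- .Machine$integer.max

# Hypothetical stand-in for the column in the question: still type "integer".
myvar <- c(int_max, 1L)

# Summing as integer overflows: R returns NA and emits the warning
# from the question (suppressed here so the script runs quietly).
int_total <- suppressWarnings(sum(myvar, na.rm = TRUE))

# Converting to double first, as the warning suggests, gives the exact total.
dbl_total <- sum(as.numeric(myvar), na.rm = TRUE)

print(int_total)
print(dbl_total)
```

Note that the individual values in myvar are intact; only the integer sum cannot be represented, which is why converting before summing is enough in this case.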
If you want a “bignum” capacity then install Martin Maechler’s ‘Rmpfr’ package. I recommend it because of its author’s reputation: Martin Maechler is also heavily involved in the development of the ‘Matrix’ package, and is a member of R Core as well. There are alternatives, including arithmetic packages such as ‘gmp’, ‘Brobdingnag’ and ‘Ryacas’ (the latter also offers a symbolic math interface).
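As a rough illustration of what such a package buys you, the sketch below uses ‘gmp’, one of the alternatives mentioned above, rather than ‘Rmpfr’. It assumes ‘gmp’ is installed and simply does nothing if it is not; the variable names are hypothetical.

```r
# Assumption: the 'gmp' package is installed; the sketch is skipped otherwise.
if (requireNamespace("gmp", quietly = TRUE)) {
  suppressPackageStartupMessages(library(gmp))

  # Arbitrary-precision integer built from the largest ordinary R integer.
  big <- as.bigz(.Machine$integer.max)

  # This exceeds the 32-bit integer range, yet no overflow occurs.
  total <- big + big + 1

  # Convert to character to inspect every digit.
  total_chr <- as.character(total)
  print(total_chr)
}
```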
Next, to respond to the critical comments in the answer you linked to, and how to assess the relevance to your work, consider this: If there were the same statistical functionality available in one of those “modern” languages as there is in R, you would probably see a user migration in that direction. But I would say that migration, and certainly growth, is in the R direction at the moment. R was built by statisticians for statistics.
There was at one time a Lisp variant with a statistics package, Xlisp-Stat, but its main developer and proponent is now a member of R-Core. On the other hand one of the earliest R developers, Ross Ihaka, suggests working toward development in a Lisp-like language. There is a compiled language called Clojure (pronounced as English speakers would say “closure”) with an experimental interface to R, Rincanter.
Update:
The new versions of R (3.0.+) have 53-bit integers of a sort (using the numeric mantissa). When an “integer” vector element is assigned a value in excess of ‘.Machine$integer.max’, the entire vector is coerced to “numeric”, a.k.a. “double”. The maximum value for integers remains as it was; however, there may be coercion of integer vectors to doubles to preserve accuracy in cases that would formerly generate overflow. Unfortunately, matrix and array dimensions are still capped at ‘.Machine$integer.max’; only the length of vectors was lifted beyond that, on 64-bit builds, by the “long vector” support added in R 3.0.0.
When reading in large values from files, it is probably safer to use character class as the target and then manipulate. If there is coercion to NA values, there will be a warning.
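The points in this update can be sketched as follows. The CSV text and column names are invented for illustration; read.csv with colClasses is just one way to force a character target.

```r
# Assigning a too-large value turns an integer vector into a double vector.
x <- 1:3
type_before <- typeof(x)
x[2] <- 3e9                      # 3e9 exceeds .Machine$integer.max
type_after <- typeof(x)

# Doubles hold whole numbers exactly only up to 2^53; beyond that, +1 is lost.
precision_lost <- (2^53 + 1) == 2^53

# Hypothetical CSV content with one value beyond the integer range.
csv_text <- "id,myvar\n1,2147483647\n2,3000000000\n"

# Read the large column as character so no digits are altered on input,
# then convert deliberately (as.numeric warns if anything becomes NA).
dat <- read.csv(text = csv_text, colClasses = c("integer", "character"))
dat$myvar_num <- as.numeric(dat$myvar)
total <- sum(dat$myvar_num)

print(c(type_before, type_after))
print(precision_lost)
print(total)
```

For values that need more than about 15 to 16 significant digits, the character column could instead be handed to a bignum package rather than as.numeric.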