My question is about avoiding namespace pollution when writing modules in R.
Right now, in my R project, I have functions1.R with doFoo() and doBar(), functions2.R with other functions, and main.R with the main program in it, which first does source('functions1.R'); source('functions2.R'), and then calls the other functions.
I’ve been starting the program from the R GUI in Mac OS X, with source('main.R'). This is fine the first time, but after that, the variables that were defined the first time through the program are defined for the second time functions*.R are sourced, and so the functions get a whole bunch of extra variables defined.
I don’t want that! I want an “undefined variable” error when my function uses a variable it shouldn’t! Twice this has given me very late nights of debugging!
So how do other people deal with this sort of problem? Is there something like source(), but that makes an independent namespace that doesn’t fall through to the main one? Making a package seems like one solution, but it seems like a big pain in the butt compared to e.g. Python, where a source file is automatically a separate namespace.
Any tips? Thank you!
The main function you want to use is sys.source(), which will load your functions/variables in a namespace (“environment” in R) other than the global one. One other thing you can do in R that is fantastic is to attach namespaces to your search() path so that you need not reference the namespace directly. That is, if “namespace1” is on your search path, a function within it, say “fun1”, need not be called as namespace1.fun1() as in Python, but as fun1(). [Method resolution order:] If there are many functions with the same name, the one in the environment that appears first in the search() list will be called. To call a function in a particular namespace explicitly, one of many possible syntaxes, albeit a bit ugly, is get("fun1","namespace1")(...) where ... are the arguments to fun1(). This should also work with variables, using the syntax get("var1","namespace1").
I do this all the time (I usually load just functions, but the distinction between functions and variables in R is small), so I’ve written a few convenience functions that load from my ~/.Rprofile. Example usage:
and so on, which will create two separate namespaces: “fun1” and “fun2”, which are attached to the search() path (“fun2” will be higher on the search() list in this case). This is akin to doing something like
manually for each file (“2” is the default position on the search() path).
The way that populate.env() is written, if a directory, say “functions/”, contains many R files without conflicting function names, you can call it as
to load all functions (and variables) into a single namespace.
With name.to.env(), you can also do something like
or
Of course, if your project grows big and you have lots and lots of functions (and variables), writing a package is the way to go.
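For reference, here is a minimal sketch of the base-R approach described above, using only sys.source(), attach() and get(). It does not reproduce the populate.env() or name.to.env() helpers; the file contents, the environment name “functions1” and the variable stray_var are made up for illustration.

```r
# Write a throwaway file so the sketch is self-contained (hypothetical contents).
src <- tempfile(fileext = ".R")
writeLines(c(
  "doFoo <- function() 'foo'",
  "doBar <- function() stray_var  # uses a variable it never defines"
), src)

# Create an empty environment on the search path (position 2 by default)
# and evaluate the file inside it instead of in the global environment.
env <- attach(NULL, name = "functions1")
sys.source(src, envir = env)

stray_var <- 42  # stands in for a leftover variable from an earlier run

a <- doFoo()                        # found through the search path
b <- get("doFoo", "functions1")()   # explicit lookup by environment name

# The function was not copied into the global environment.
in_global <- exists("doFoo", envir = globalenv(), inherits = FALSE)

# doBar's enclosing environment is "functions1"; its parents are the entries
# further down the search path, so the global stray_var is never consulted
# and the call fails with an error instead of silently picking it up.
res <- tryCatch(doBar(), error = function(e) conditionMessage(e))

detach("functions1")  # remove the environment from the search path again
```

Because the functions’ enclosing environment sits below the global environment on the search path, a stray global variable should produce the “object not found” error the question asks for, rather than being used by accident.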