When I undertake an R project of any complexity, my scripts quickly get long and confusing.
What are some practices I can adopt so that my code will always be a pleasure to work with? I’m thinking about things like
- Placement of functions in source files
- When to break something out to another source file
- What should be in the master file
- Using functions as organizational units (whether this is worthwhile given that R makes it hard to access global state)
- Indentation / line break practices.
  - Treat ( like {?
  - Put things like )} on 1 or 2 lines?
Basically, what are your rules of thumb for organizing large R scripts?
The standard answer is to use packages — see the Writing R Extensions manual as well as different tutorials on the web.
It gives you, among other things, R CMD check.
Just running source() over code works for really short snippets. Everything else should be in a package, even if you do not plan to publish it, as you can write internal packages for internal repositories.
As for the ‘how to edit’ part, the R Internals manual has excellent R coding standards in Section 6. Otherwise, I tend to use defaults in Emacs’ ESS mode.
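As a rough illustration of the package route recommended above, here is a minimal sketch that turns a couple of loose functions into a package skeleton with utils::package.skeleton(). The function names (center, range01), the package name (mytools), and the temporary directory are all made up for the example; a real project would use its own names and a permanent location.

```r
# Hypothetical helper functions standing in for code that would otherwise
# live in one long script. They are kept in their own environment so the
# skeleton picks up only these objects.
env <- new.env()
env$center  <- function(x) x - mean(x)
env$range01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Throwaway location for the demo (assumption: a temp directory is fine here).
pkg_root <- tempfile("pkgdemo")
dir.create(pkg_root)

# Generate the skeleton: DESCRIPTION, NAMESPACE, one R file per function,
# and help-file stubs under man/.
utils::package.skeleton(name = "mytools", environment = env, path = pkg_root)

pkg_dir <- file.path(pkg_root, "mytools")
print(list.files(pkg_dir, recursive = TRUE))
```

The generated help-file stubs and DESCRIPTION still need to be filled in by hand. Once they are, running R CMD build and then R CMD check on the result from a shell exercises the sanity checks mentioned above.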
Update 2008-Aug-13: David Smith just blogged about the Google R Style Guide.