Modules in R

2024-01-20

Provides modules as an organizational unit for source code. Modules force you to be more rigorous when defining dependencies, and they have a local search path. They can be used as a sub-unit within packages or in scripts.

Installation

From CRAN:

install.packages("modules")

From GitHub:

if (requireNamespace("devtools", quietly = TRUE)) devtools::install_github("wahani/modules")

Introduction

The key idea of this package is to provide a unit of source code which has its own scope. The main and most reliable infrastructure for such organizational units in the R ecosystem is a package. Modules can be used as stand-alone, ad-hoc substitutes for a package or as a sub-unit within a package.

When modules are defined inside of packages, they act as bags of functions (like objects in object-oriented programming). Outside of packages, modules define entities which only know of the base environment, i.e. within a module the base environment is the only package on the search path. In both cases a module is always represented as a list inside R.
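The list representation can be checked directly. The following is a minimal sketch that uses modules::module, which is introduced in the next section; the object name m and the function foo are only illustrative.

```r
library("modules")

# Define a module with a single function.
m <- module({
  foo <- function() "foo"
})

is.list(m)  # a module is stored as a list
names(m)    # the names of the objects it exposes
m$foo()     # members are accessed like list elements
```

Because a module is a list, the usual tools for lists, such as names and the $ operator, apply to it.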

Scoping of modules

We can create a module using the modules::module function. A module is similar to a function definition; it comprises:

Similar to a function you may supply arguments to a module; see the vignette on modules as objects on this topic.

To illustrate the very basic functionality of a module, consider the following example:

library("modules")
m <- module({
  foo <- function() "foo"
})
m$foo()
## [1] "foo"

Here m is the collection of objects created inside the module. It is a list with the function foo as its only element. We can do the same thing and define a module in a separate file:

module.R

foo <- function() "foo"

main.R

m <- modules::use("module.R")
m$foo()
## [1] "foo"

The two examples illustrate the two ways in which modules can be constructed. Since modules are isolated from the .GlobalEnv, the following object x cannot be found:

x <- "hey"
m <- module({
  someFunction <- function() x
})
m$someFunction()
## Error in m$someFunction(): object 'x' not found
getSearchPathContent(m)
## List of 4
##  $ modules:root     : chr "someFunction"
##  $ modules:internals: chr [1:10] "attach" "depend" "export" "expose" ...
##  $ base             : chr [1:1268] "!" "!.hexmode" "!.octmode" "!=" ...
##  $ R_EmptyEnv       : chr(0) 
##  - attr(*, "class")= chr [1:2] "SearchPathContent" "list"

Two features of modules are important at this point: imports and exports. The following subsections explain how to work with these two features.

Imports

If you rely on exported objects of a package, you can refer to them explicitly using the :: operator:

m <- module({
  functionWithDep <- function(x) stats::median(x)
})
m$functionWithDep(1:10)
## [1] 5.5

Alternatively, you can use import to attach single objects or whole packages. import acts as a substitute for library, with an important difference: library has the side effect of changing the search path of the complete R session, whereas import only changes the search path of the calling environment. The side effect is therefore local to the module and does not affect the global state of the R session.

m <- module({
  import("stats", "median") # make median from package stats available

  functionWithDep <- function(x) median(x)
})
m$functionWithDep(1:10)
## [1] 5.5
getSearchPathContent(m)
## List of 5
##  $ modules:root     : chr "functionWithDep"
##  $ modules:stats    : chr "median"
##  $ modules:internals: chr [1:10] "attach" "depend" "export" "expose" ...
##  $ base             : chr [1:1268] "!" "!.hexmode" "!.octmode" "!=" ...
##  $ R_EmptyEnv       : chr(0) 
##  - attr(*, "class")= chr [1:2] "SearchPathContent" "list"

m <- module({
  import("stats") # make all exported objects of package stats available

  functionWithDep <- function(x) median(x)
})
m$functionWithDep(1:10)
## [1] 5.5
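That import leaves the session untouched can be checked by comparing the global search path before and after a module is defined. This is a minimal sketch, assuming the package tools (shipped with R) is installed; the names pathTools and extension are hypothetical.

```r
library("modules")

# Record the search path of the R session before defining the module.
before <- search()

pathTools <- module({
  import("tools", "file_ext") # only visible inside this module

  extension <- function(path) file_ext(path)
})

pathTools$extension("data.csv")

# The session search path is the same as before the import.
identical(before, search())
```

The imported function is available to the code inside the module, while the search path of the session stays as it was.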

Importing modules

To import other modules, call the function use. use really just means import module. With use we can load modules that already exist as objects as well as modules that are defined in files.

Consider the following example:

mm <- module({
  m <- use(m)
  anotherFunction <- function(x) m$functionWithDep(x)
})
mm$anotherFunction(1:10)
## [1] 5.5

To load modules from a file we can refer to the file directly:

module({
  m <- use("someFile.R")
  # ...
})
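For a version that can be run as-is, the following sketch writes a module to a temporary file and loads it with use. The file content and the function greet are made up for illustration; only use itself comes from the package.

```r
library("modules")

# Write a small module definition to a temporary file.
path <- tempfile(fileext = ".R")
writeLines('greet <- function(name) paste("Hello,", name)', path)

# Load the file as a module and call the function it defines.
greeter <- use(path)
greeter$greet("world")

# Clean up the temporary file.
unlink(path)
```

The file contains plain R code without a call to module; use evaluates it and returns the resulting module.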

Exports

Modules can help to isolate code from the state of the global environment. By doing so we may have reduced the complexity in our global environment and moved it into a module. To make it obvious which parts of a module are meant to be used, we can also define exports. Objects that are not exported are not accessible from outside the module.

Properties of exports are:

m <- module({
  export("fun")

  fun <- identity # public
  privateFunction <- identity

  # names starting with a dot are always private
  .privateFunction <- identity
})

m
## fun:
## function(x)
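The effect of export can also be inspected programmatically. The sketch below repeats the module definition so that it runs on its own, then looks at the names of the resulting list.

```r
library("modules")

m <- module({
  export("fun")

  fun <- identity             # public
  privateFunction <- identity # not exported
})

names(m)                   # only exported names are part of the module
is.null(m$privateFunction) # non-exported objects cannot be reached with $
m$fun(1)                   # the exported function works as usual
```

Accessing a non-exported name with $ yields NULL, as it would for any missing list element, so calling it fails early instead of silently using an internal helper.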

Example: Modules as Parallel Process

One example where you may want to have more control of the enclosing environment of a function is when you parallelize your code. First consider the case when a naive implementation fails.

library("parallel")
dependency <- identity
fun <- function(x) dependency(x)

cl <- makeCluster(2)
clusterMap(cl, fun, 1:2)
## Error in checkForRemoteErrors(val): 2 nodes produced errors; first error: could not find function "dependency"
stopCluster(cl)

To make the function fun self-contained, we can define it in a module.

m <- module({
  dependency <- identity
  fun <- function(x) dependency(x)
})

cl <- makeCluster(2)
clusterMap(cl, m$fun, 1:2)
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 2
stopCluster(cl)

Note that the parallel computing facilities in R always provide a way to handle such situations. It is just a matter of organization whether you believe the function itself or the parallel interface should handle its dependencies.
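For comparison, here is a sketch of the alternative in which the parallel interface handles the dependency. It uses clusterExport from the parallel package to copy dependency to the workers before fun is called; the variable res is only illustrative.

```r
library("parallel")

dependency <- identity
fun <- function(x) dependency(x)

cl <- makeCluster(2)
# Copy `dependency` into the global environment of every worker.
clusterExport(cl, "dependency", envir = environment())
res <- clusterMap(cl, fun, 1:2)
stopCluster(cl)

res
```

With this approach the caller has to know which objects fun depends on, whereas the module version carries its dependencies with it.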