- an attendant in a hospital responsible for the non-medical care of patients and the maintenance of order and cleanliness.
- a soldier who carries orders or performs minor tasks for an officer.
orderly is a package designed to help make analysis more reproducible. Its principal aim is to automate a series of basic steps in the process of writing analyses, making it easy to:
With orderly we have two main hopes:
orderly requires a few conventions around the organisation of a project, and after that tries to keep out of your way. These requirements are designed to make collaborative development with git easier by minimising conflicts, and to make backup easier by using an append-only storage system.
One often-touted advantage of R over point-and-click analysis packages is that a scripted analysis is more reproducible. However, essentially all analyses depend on external resources - packages, data, code, and R itself; any change in these external resources might change the results. Preventing such changes is not always possible, but tracking them should be straightforward - all we need to know is what is being used.
For example, while reproducible research has become synonymous with literate programming, this approach often increases the number of external resources. A typical knitr document will depend on:
- the source file (.Rmd or .Rnw)
- code read in with source
The orderly package helps by
The core problem is that analyses have no general interface. Consider in contrast the role that functions take in programming. All functions have a set of arguments (inputs) and a return value (outputs). With orderly, we borrow this idea, and each piece of analysis will require that the user describes what is needed and what will be produced.
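To make the function analogy concrete, here is a minimal sketch in base R of an analysis run with declared inputs and outputs. The helper `run_declared()` and its arguments are hypothetical and are not part of orderly; the sketch only illustrates the idea of copying declared files into a clean directory and checking that the promised outputs appear.

```r
# Hypothetical helper for illustration only; this is not an orderly function.
# script, resources: file names (relative to src) that the analysis declares.
# artefacts: file names the script promises to create.
run_declared <- function(script, resources, artefacts, src = ".") {
  workdir <- tempfile("run-")
  dir.create(workdir)

  # Copy only the declared files, so an undeclared dependency fails loudly
  ok <- file.copy(file.path(src, c(script, resources)), workdir)
  stopifnot(all(ok))

  # Run the script inside the clean directory, restoring the old one on exit
  old <- setwd(workdir)
  on.exit(setwd(old))
  source(script, local = new.env(parent = globalenv()))

  # Check that every declared output was actually produced
  missing <- artefacts[!file.exists(artefacts)]
  if (length(missing) > 0) {
    stop("Artefacts not produced: ", paste(missing, collapse = ", "))
  }
  file.path(workdir, artefacts)
}

# Usage: a toy analysis with one data file and one output file
src <- tempfile("src-")
dir.create(src)
writeLines(c("x", "1", "2", "3"), file.path(src, "data.csv"))
writeLines(
  'd <- read.csv("data.csv"); writeLines(as.character(mean(d$x)), "mean.txt")',
  file.path(src, "script.R"))

out <- run_declared("script.R", resources = "data.csv",
                    artefacts = "mean.txt", src = src)
readLines(out)
```

If the script read a file that was not listed in `resources`, it would fail in the clean directory, which is the point of declaring inputs up front.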
The user describes the inputs of their analysis, including:
The user also provides a list of “artefacts” (file-based results) that they will produce.
Then orderly:
It then stores metadata alongside the analysis including hashes of all inputs and outputs, copies of data extracted from the database, a record of all R packages loaded at the end of the session, and (if using git) information about the git state (hash, branch and status).
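As a rough illustration of the kind of metadata described above, the following base R sketch hashes a set of input and output files and records the attached packages. The helper `record_metadata()` and the layout of the list it returns are assumptions made for this example; they do not reflect the format orderly itself uses.

```r
# Hypothetical helper for illustration only; this is not orderly's metadata format.
record_metadata <- function(inputs, outputs) {
  hash_files <- function(paths) {
    # tools::md5sum() returns a character vector of MD5 hashes named by path
    hashes <- tools::md5sum(paths)
    stats::setNames(unname(hashes), basename(paths))
  }
  info <- utils::sessionInfo()
  list(
    time = Sys.time(),
    r_version = info$R.version$version.string,
    # Versions of packages attached with library(); empty if there are none
    packages = vapply(info$otherPkgs, function(p) p$Version, character(1)),
    inputs = hash_files(inputs),
    outputs = hash_files(outputs)
  )
}

# Usage: a toy analysis with one input and one output file
dir <- tempfile("analysis-")
dir.create(dir)
input <- file.path(dir, "data.csv")
output <- file.path(dir, "summary.csv")
write.csv(data.frame(x = 1:3), input, row.names = FALSE)
write.csv(data.frame(mean_x = mean(read.csv(input)$x)), output,
          row.names = FALSE)

meta <- record_metadata(input, output)
str(meta)
```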
Then if one of the dependencies of a report changes (the data used, the code, etc.), we have metadata that can be queried to identify the likely source of the change.
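A minimal sketch of such a query, assuming each run's input hashes are available as a named character vector (file name to hash). The function `changed_inputs()` and the hash values below are made up for illustration and are not part of orderly.

```r
# Hypothetical helper for illustration only; this is not an orderly function.
# old, new: named character vectors mapping file name -> hash for two runs.
changed_inputs <- function(old, new) {
  all_names <- union(names(old), names(new))
  old_hash <- old[all_names]  # NA where the file was absent in the old run
  new_hash <- new[all_names]  # NA where the file is absent in the new run
  status <- ifelse(is.na(old_hash), "added",
            ifelse(is.na(new_hash), "removed",
            ifelse(old_hash == new_hash, "unchanged", "modified")))
  names(status) <- all_names
  status[status != "unchanged"]
}

# Usage with made-up hashes for two runs of the same report
run1 <- c("script.R" = "a1", "data.csv" = "b2", "helpers.R" = "c3")
run2 <- c("script.R" = "a1", "data.csv" = "ff", "extra.R" = "d4")
changed_inputs(run1, run2)
```

Here `data.csv` is reported as modified, `helpers.R` as removed and `extra.R` as added, while the unchanged `script.R` is dropped from the result.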
orderly
In the MRC Centre for Global Infectious Disease Analysis we use orderly on three major projects:
The workflows we have developed here are oriented towards collaborative groups of researchers - other workflows are possible (indeed orderly is also designed to support a decentralised workflow, though this has not been used in practice yet).
In these projects we have a group of researchers who develop and test analyses locally. These are developed on a branch in git and then run on a centralised staging environment (a duplicate of our production environment). The code and outputs are reviewed with the help of GitHub’s “Pull requests” and then the reports are run on our production environment.
Interaction with the remote environments is achieved using an HTTP API, which orderly itself uses transparently, so that reports can be run remotely, directly from R. The remote systems also include an interactive web interface that can be used to explore and download versions of analyses, as well as run new ones.
orderly has a database, which should be the preferred way of querying the report archive from other programs. The schema is programmatically described at inst/database/schema.yml and automatically generated database documentation is available here.
There is a set of regression tests that require the reference data. Enable these by running the script ./scripts/copy_reference, which creates data in tests/testthat/reference.
There are addins available to help with development workflows.
See docs at orderly.rstudio for setup and usage instructions.
Install orderly from CRAN with
install.packages("orderly")
To install our internally released version (which might be ahead of CRAN) via drat, use
# install.packages("drat")
drat:::add("vimc")
install.packages("orderly")
MIT © Imperial College of Science, Technology and Medicine