AdhereR is an R package
that implements, in an open and standardized manner, various methods
linked to the estimation of adherence to treatment from a
variety of data sources and formats (please see the other vignettes in
the package, by invoking, for example
browseVignettes(package="AdhereR")
or by visiting the
package’s site on CRAN). One of the
main aims of the package is to allow users to produce high quality,
publication-ready and highly customizable graphical
representations of both the patterns in the raw data and of the
various estimates of adherence. This can be normally achieved from an
R
session or script using the plot()
function
applied to an estimated CMA
object (the raw patterns are
plotted by creating a basic CMA0
object), as detailed in
the AdhereR:
Adherence to Medications vignette. However, while allowing for a
very fine-grained control over the resulting plots, this requires a
certain level of familiarity with R
(loading the source
data, creating the appropriate CMA
object, invoking the
plot()
function with the desired parameters, and the export
of the resulting plot in the desired format at the quality and with the
other desired characteristics), on the one hand, and the process is
rather cumbersome when the user wants to explore and understand the
data, or to try various types of plotting in search for the optimal
visualization, on the other.
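The scripted workflow described above can be sketched roughly as follows. This is only a minimal illustration, assuming that AdhereR is installed (the block is skipped otherwise); the column names and date format are those of the bundled med.events dataset, while the choice of two patients is an arbitrary assumption made to keep the plot small.

```r
# Minimal sketch of the scripted (non-interactive) workflow.
# The whole block is skipped when AdhereR is not installed.
cma0 <- NULL
if (requireNamespace("AdhereR", quietly = TRUE)) {
  library(AdhereR)

  # 1. Load the source data (here, the sample dataset shipped with AdhereR)
  #    and keep only two patients (arbitrary choice for this illustration):
  d <- med.events[med.events$PATIENT_ID %in% c(1, 2), ]

  # 2. Create the CMA object (CMA0 simply holds the raw patterns):
  cma0 <- CMA0(data = d,
               ID.colname = "PATIENT_ID",
               event.date.colname = "DATE",
               event.duration.colname = "DURATION",
               event.daily.dose.colname = "PERDAY",
               medication.class.colname = "CATEGORY",
               date.format = "%m/%d/%Y")

  # 3. Plot it on the current graphics device:
  plot(cma0)
}
```

Every change to the plot means editing and re-running such a script, which is the overhead the interactive interface aims to remove.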
These reasons prompted us to develop a fully interactive user
interface that should hide the “gory details” of data loading,
CMA
computation and plot()
invocation under an
intuitive and easy to use point-and-click interface, while allowing fast
exploration and customization of the plots. However, because this
interactive user interface covers a rather particular set of use cases,
tends to be rather “heavy” in terms of dependencies, and may not install
or run properly in some environments (e.g., headless servers or older
systems), we decided to implement it in a separate package,
AdhereRViz
that extends AdhereR
(i.e.,
AdhereRViz
requires AdhereR
, but
AdhereR
can happily run without
AdhereRViz
).
We use Shiny
,
which allows us to build a self-contained app that can be run
locally or remotely inside a standard web browser (such as Firefox, Google Chrome, Safari, Internet
Explorer, Edge or
Opera) on multiple Operating
Systems (such as Microsoft Windows,
Apple’s macOS and iOS, Google’s Android, and several flavors of
Linux – e.g., Debian, Ubuntu, Fedora, RedHat,
CentOS, Arch… – and BSD – e.g., FreeBSD) and devices
ranging from desktop and laptop computers to mobile phones and tablets.
The app’s interface uses standard controls and paradigms,
ensuring a similar user experience across browsers, platforms and
devices.
Locally, the app can be launched from a normal
R
session (including from within RStudio
) or
script with a single command; of course, the latest version of
AdhereR
and AdhereRViz
must be
installed on the system (using, for example,
install.packages("AdhereRViz", dep=TRUE)
or
RStudio
’s Tools → Install Packages… menu;
or, in case it is already installed, updated using
update.packages()
or RStudio
’s Tools
→ Check for Package Updates… menu), and loaded in the
current session (using, for example, library(AdhereRViz)
or
require(AdhereRViz)
). With these prerequisites in order,
the app can be launched without any parameters with
plot_interactive_cma()
or, if so desired, by specifying a data source and the important
column names and optionally the desired CMA
, as in the
following example, where we use CMA0
(i.e., the raw data)
from the sample dataset med.events
(see its structure in
the Table below) included with the AdhereR
package:
plot_interactive_cma(data=med.events, # included sample dataset
cma.class="simple", # simple cma, defaults to CMA0
# The important column names:
ID.colname="PATIENT_ID",
event.date.colname="DATE",
event.duration.colname="DURATION",
event.daily.dose.colname="PERDAY",
medication.class.colname="CATEGORY",
# The format of dates in the "DATE" column:
date.format="%m/%d/%Y");
PATIENT_ID | DATE | PERDAY | CATEGORY | DURATION |
---|---|---|---|---|
1 | 03/22/2035 | 2 | medB | 30 |
1 | 03/31/2035 | 2 | medB | 30 |
2 | 01/20/2036 | 4 | medA | 50 |
2 | 03/10/2036 | 4 | medA | 50 |
2 | 08/01/2036 | 4 | medA | 50 |
2 | 08/01/2036 | 4 | medB | 60 |
2 | 09/21/2036 | 4 | medB | 60 |
2 | 01/24/2037 | 4 | medB | 60 |
2 | 04/16/2037 | 4 | medB | 60 |
2 | 05/08/2037 | 4 | medB | 60 |
3 | 04/13/2042 | 4 | medA | 50 |
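To illustrate what the date.format argument refers to, the following sketch rebuilds the first three rows of the table by hand and parses the DATE column using base R only. The object names (events, parsed_dates, gap_days) are made up for this illustration; the format string follows the conventions of base R's strptime().

```r
# The first three rows of the table above, typed in by hand:
events <- data.frame(
  PATIENT_ID = c(1, 1, 2),
  DATE       = c("03/22/2035", "03/31/2035", "01/20/2036"),
  PERDAY     = c(2, 2, 4),
  CATEGORY   = c("medB", "medB", "medA"),
  DURATION   = c(30, 30, 50),
  stringsAsFactors = FALSE
)

# "%m/%d/%Y" means: month/day/4-digit year
parsed_dates <- as.Date(events$DATE, format = "%m/%d/%Y")

# Number of days between the first two events of patient 1:
gap_days <- as.numeric(difftime(parsed_dates[2], parsed_dates[1], units = "days"))

print(parsed_dates)
print(gap_days)
```

A format string that does not match the data yields NA dates, so checking for NA values after parsing is a cheap sanity test before launching the App.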
The app can also be launched in the “standard way” using
runApp()
or RStudio
’s ▶︎ Run app button.
Alternatively, the app may be made available on a remote server, such as on https://www.shinyapps.io/, in which case it can be accessed simply by pointing the web browser to the app’s internet address.
Please note that launching the App with no parameters opens a different screen (see the Selecting/changing the data source section for details).
Also note that we provide a “stub” function
plot_interactive_cma()
in the package AdhereR
,
but this simply checks if AdhereRViz
is installed and
functional, and then tries to invoke plot_interactive_cma()
from AdhereRViz
.
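The "check, then delegate" pattern behind such a stub can be sketched as below. This is not the actual AdhereR source code: call_if_available() is a hypothetical helper written for this illustration, using only base R functions (requireNamespace() and getExportedValue()).

```r
# Hypothetical helper (NOT AdhereR's actual source code): call a function from
# an optional package only if that package can be loaded.
call_if_available <- function(pkg, fn, ...) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    warning("Package '", pkg, "' is not installed or cannot be loaded.",
            call. = FALSE)
    return(invisible(NULL))
  }
  # Look up the exported function by name and forward all arguments to it:
  getExportedValue(pkg, fn)(...)
}

# Quick check with a package that ships with every R installation:
call_if_available("stats", "median", c(1, 2, 9))

# A stub in the spirit of the one described above would then reduce to
# (not run here, as it would open the App):
# call_if_available("AdhereRViz", "plot_interactive_cma", data = med.events)
```

Written this way, the lightweight package never has to list the heavier one as a hard dependency.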
The App’s UI has several main elements which can be seen below. Most UI elements have tooltips that show up on hovering the mouse over the element and that offer specific information (but please note that these tooltips might need some time before showing up).
Clicking on the About button (element
1 in the overview figure)
opens a box with info about the App, such as the version of the
AdhereR
package, an overview of the package and links to
where more help (such as vignettes) can be found.
It is recommended to cleanly exit the App by clicking the
Exit… button (element 2 in the overview figure), as simply closing the
browser will not normally also stop the R
process
running in the background. Please note that currently, exiting the App
will not also close the browser window or tab in which the App was
running…
The current plot is displayed in the UI element 3 (in the overview figure), a canvas that can be re-sized using the UI elements 4 in the overview figure (see also below) and which, when too big, can be scrolled horizontally and vertically at will. This canvas is currently passive in the sense that it simply displays a plot with which interaction is possible only using the other elements of the UI, but almost all aspects of this plot can be tweaked using controls from the left-hand side vertical panel (element 5 in the overview figure; see details below).
While the interpretation of these plots should be relatively intuitive, it is nevertheless detailed in the AdhereR: Adherence to Medications vignette.
Element 9 in the overview
figure displays most of the information
messages, warnings and errors generated during the plotting process
(please note that currently some messages, warnings and errors might not
be captured and only shown in the R
console). For example,
here, the informational message Plotting
patient ID ‘1’ with CMA ‘CMA9’ Plotting patient ID ‘2’ with CMA
‘CMA9’ means that the computation and plotting of
CMA9
for patients with IDs 1
and
2
was successful.
Elements 4 in the overview figure allow the control of the horizontal and vertical dimensions of the plot 3 either coupled (i.e., keeping the current width–to-height ratio), when the Keep ratio switch is ON, or independently of each other, when the switch is OFF (in which case a new slider controlling the plot height appears). The interaction with the slider(s) can be done either with mouse or with the arrow keys.
Please note that there is a minimum size requirement for a plot to be displayed, otherwise an error of the type
Plotting area is too small (it must be at least 10 x 0.5 characters per event, but now it is only 31.1 x 0.5)!
is thrown, in which case either the plotting area needs to be increased using the Plot width (and, if visible, Plot height) slider(s), or the number of patients or the duration to be shown need to be reduced. Alternatively, the Advanced section (see Section Setting parameters for details) can be used to decrease these minimum requirements (but this is not recommended in most cases).
The current plot can be exported to a variety of formats by turning the Save plot! switch ON:
These new UI elements (10) allow:
By pressing the Save plot button, the user can select the location and file name (relative to the local machine) under which the plot will be exported.
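Outside the App, the usual way to export a plot from a script is to open a graphics device, draw, and close the device. The sketch below uses base R's pdf() device with a trivial stand-in plot; the file name and the dimensions are arbitrary assumptions, and in practice the plot(1:10, ...) call would be replaced by the plot() call on a CMA object.

```r
# Where to save the plot (a temporary folder, for illustration only):
out_file <- file.path(tempdir(), "example_plot.pdf")

# Open the device; for pdf() the width and height are given in inches:
pdf(out_file, width = 8, height = 5)

# Stand-in for the actual plot(cma, ...) call:
plot(1:10, main = "Stand-in for plot(cma, ...)")

# Closing the device is what actually writes the file:
dev.off()

file.exists(out_file)
```

Bitmap devices such as png(), jpeg() and tiff() follow the same open, draw, close pattern, with the size and resolution passed as arguments when the device is opened.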
R code that would produce the current plot
While the main use scenarios for this App are built around
interactivity, the user may want to generate the same (or similar) plots
as the one currently displayed (in element 3 in the overview figure). To allow this, we provide
the Show R code… button (element 7 in
the overview figure), which opens a box with
the clearly commented R
code:
Clicking the Copy to clipboard button copies the
R
code to the clipboard, from where it can be pasted into
an editor of choice (such as RStudio
). In particular, for
the plot shown in the overview figure, the
R
code displayed is:
# The R code corresponding to the currently displayed Shiny plot:
#
# Extract the data for the selected 2 patient(s) with ID(s):
# "1", "2"
#
# We denote here by DATA the data you are using in the Shiny plot.
# This was manually defined as an object of class data.frame
# (or derived from it, such as a data.table) that was already in
# memory under the name 'med.events'.
# Assuming this object still exists with the same name, then:
DATA <- med.events;
# These data have 5 columns, and contain info for 100 patients.
#
# To allow using data from other sources than a "data.frame"
# and other similar structures (for example, from a remote SQL
# database), we use a mechanism to request the data for the
# selected patients that uses a function called
# "get.data.for.patients.fnc()" which you may have redefined
# to better suit your case (chances are, however, that you are
# using its default version appropriate to the data source);
# in any case, the following is its definition:
get.data.for.patients.fnc <- function(patientid, d, idcol, cols=NA, maxrows=NA) d[ d[[idcol]] %in% patientid, ]
# Try to extract the data only for the selected patient ID(s):
.data.for.selected.patients. <- get.data.for.patients.fnc(
c("1", "2"),
### don't forget to put here your REAL DATA! ###
DATA,
"PATIENT_ID"
);
# Compute the appropriate CMA:
cma <- CMA9(data=.data.for.selected.patients.,
# (please note that even if some parameters are
# not relevant for a particular CMA type, we
# nevertheless pass them as they will be ignored)
ID.colname="PATIENT_ID",
event.date.colname="DATE",
event.duration.colname="DURATION",
event.daily.dose.colname="PERDAY",
medication.class.colname="CATEGORY",
carry.only.for.same.medication=FALSE,
consider.dosage.change=FALSE,
followup.window.start=0,
followup.window.start.unit="days",
followup.window.duration=730,
followup.window.duration.unit="days",
observation.window.start=0,
observation.window.start.unit="days",
observation.window.duration=730,
observation.window.duration.unit="days",
date.format="%m/%d/%Y"
);
if( !is.null(cma) ) # if the CMA was computed ok
{
# Try to plot it:
plot(cma,
# (same idea as for CMA: we send arguments even if
# they aren't used in a particular case)
align.all.patients=FALSE,
align.first.event.at.zero=FALSE,
show.legend=TRUE,
legend.x="right",
legend.y="bottom",
legend.bkg.opacity=0.5,
legend.cex=0.75,
legend.cex.title=1,
duration=NA,
show.period="days",
period.in.days=90,
bw.plot=FALSE,
col.na="#D3D3D3",
unspecified.category.label="drug",
col.cats=rainbow,
lty.event="solid",
lwd.event=2,
pch.start.event=15,
pch.end.event=16,
col.continuation="#000000",
lty.continuation="dotted",
lwd.continuation=1,
cex=1,
cex.axis=1,
cex.lab=1.25,
highlight.followup.window=TRUE,
followup.window.col="#00FF00",
highlight.observation.window=TRUE,
observation.window.col="#FFFF00",
observation.window.density=35,
observation.window.angle=-30,
observation.window.opacity=0.3,
show.real.obs.window.start=TRUE,
real.obs.window.density=35,
real.obs.window.angle=30,
print.CMA=TRUE,
CMA.cex=0.5,
plot.CMA=TRUE,
CMA.plot.ratio=0.1,
CMA.plot.col="#90EE90",
CMA.plot.border="#006400",
CMA.plot.bkg="#7FFFD4",
CMA.plot.text="#006400",
plot.CMA.as.histogram=TRUE,
show.event.intervals=TRUE,
print.dose=TRUE,
print.dose.outline.col="#FFFFFF",
print.dose.centered=FALSE,
plot.dose=FALSE,
lwd.event.max.dose=8,
plot.dose.lwd.across.medication.classes=FALSE,
min.plot.size.in.characters.horiz=10,
min.plot.size.in.characters.vert=0.5
);
}
This code is pretty much ready to be run, except for some issues that
might surround accessing the actual data used for plotting: the user is
reminded of these through the yellow-on-red
bold italic highlighting of DATA
(not shown in the
code listing above). In a nutshell (for details, see below), if (a) the
user interactively uses the App to load or connect to a data source
(such as an external file or an SQL database), then the identity of this
data source is known (the file name or the database location), but if
(b) the data source was passed to the
plot_interactive_cma()
function as the data
argument, the App cannot know how this data source was named (and this
“name” might not even exist if, for example, the data was created
on-the-fly while calling the plot_interactive_cma()
function). Wickedly, even in case (a), it is generally unsafe to assume
that the data source will stay the same (or will be accessible in the
same way) in the future. Thus, while we provide as much info about the
data source used to produce the current plot as possible, we also warn
the user to be careful when running this code!
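To make the role of the data-access function concrete, here is a small sketch that applies the default get.data.for.patients.fnc() shown in the listing above to a toy data.frame. The toy data and the object names toy and selected are invented for this illustration.

```r
# Default definition, copied from the generated code above:
get.data.for.patients.fnc <- function(patientid, d, idcol, cols=NA, maxrows=NA) d[ d[[idcol]] %in% patientid, ]

# A made-up stand-in for the real data source:
toy <- data.frame(PATIENT_ID = c(1, 1, 2, 3),
                  DURATION   = c(30, 30, 50, 50))

# Request the rows of patients "1" and "2" only; %in% compares the numeric
# IDs with the character IDs after coercing both to character:
selected <- get.data.for.patients.fnc(c("1", "2"), toy, "PATIENT_ID")
print(selected)
```

Replacing the toy object with the real data source is precisely the step that the highlighted DATA placeholder asks the user to verify.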
By switching the UI element 8 (in the overview figure) Compute CMA for
several patients… to ON, the user unlocks a
new set of UI elements that allow the computation of the currently
defined CMA
for more patients and the export of the results
to an external file.
First, it is important to highlight that the App is not
intended for heavy computations, which explains why we are currently
limiting this CMA computation to at most 100 patients,
at most 5000 events across all patients (if more
patients or events are selected, the computation will be done for only
the first 100 and 5000, respectively) and for at most 5
minutes of running time (after which the computation is
automatically stopped). If seriously heavy computation is needed, we
recommend the use of the appropriate CMA()
functions from
an R
session or script, which allow many types of parallel
processing and the use of several types of data sources with very
fine-grained control, as described in the vignettes AdhereR:
Adherence to Medications and Using
AdhereR with various database technologies for processing very large
datasets. The R
code needed to compute the current CMA
can be accessed through the Show R code button (UI
element 7).
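For comparison, computing a CMA for all patients from a script might look roughly like the sketch below. It assumes that AdhereR is installed (the block is skipped otherwise); the choice of CMA1, the 730-day windows and the output file name are arbitrary assumptions made for this illustration, not settings recommended by the App.

```r
cma_values <- NULL
if (requireNamespace("AdhereR", quietly = TRUE)) {
  library(AdhereR)

  # Compute a simple CMA for all the patients in the sample dataset:
  cma1 <- CMA1(data = med.events,
               ID.colname = "PATIENT_ID",
               event.date.colname = "DATE",
               event.duration.colname = "DURATION",
               followup.window.start = 0,
               followup.window.start.unit = "days",
               followup.window.duration = 730,
               followup.window.duration.unit = "days",
               observation.window.start = 0,
               observation.window.start.unit = "days",
               observation.window.duration = 730,
               observation.window.duration.unit = "days",
               date.format = "%m/%d/%Y")

  # Extract the estimates as a data.frame and export them to a file:
  cma_values <- getCMA(cma1)
  out_file <- file.path(tempdir(), "cma1_all_patients.csv")
  write.csv(cma_values, out_file, row.names = FALSE)
}
```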
The patients for which the computation of the CMA is to be performed can be selected in two main ways:
These two ways of selecting patients should be flexible enough to
cover most cases of (semi-)interactive use; for more patients and/or the
selection of patients based on more complex criteria, we suggest the use
of the R
code in a script.
After patients have been selected, the user can press the Compute CMA button (UI element 12) to access a specialized dialog box (see figure below) where the CMA computation can be started, its progress monitored, or stopped, and from where the results can be exported to file.
The left-hand panel has two tabs: Params and Data, and we are focusing here on Params, which contains various parameters customizing the computed CMA and the plotting of the results. UI element 5 in the overview figure shows part of this panel, but the following principles apply:
CMA0, CMA5-9, per episodes, and sliding windows).
We will now go through all sections one by one.
The General settings section is always visible and allows the selection of:
CMA type: the type of CMA to compute, which can be (please see Dima & Dediu, 2017, and the vignette AdhereR: Adherence to Medications for more details):
CMA0 to CMA9,
CMA to compute: the “simple” CMA to compute,
either by itself (for CMA type ==
simple) or iteratively (for the other two “complex”
types); please note that by definition CMA0
cannot be used
with “complex” CMAs (which explains why it cannot be selected in these
cases)
Patient(s) to plot: the list of patient IDs, selected from a drop-down list (which allows multiple selections) containing all the patient IDs in the current data source (at least one patient must be selected, otherwise an error is generated)
Depending on these selections the plot may change or various types of errors or warnings may be thrown.
These two sections are very similar and allow the definition of the follow-up (FUW) and observation (OW) windows by specifying:
their start: this can be either:
and their duration as a number of units (days, weeks, months or years).
This section is shown only for CMA5
to CMA9
and concerns the way carry over is considered:
This section is shown only for CMA per episodes and concerns the way treatment episodes are defined:
This section is shown only for sliding windows and concerns the way this sequence of regularly spaced and uniform sliding windows is defined:
SW start: when is the first sliding window starting (relative to the start of the OW) in terms of units (SW start unit) that can be days, weeks, months or years
how long is one such sliding window SW duration in terms of SW duration unit (days, weeks, months or years)
the step between two consecutive sliding windows can be defined either in terms of:
Plot CMA as histogram?: should the distribution of CMA estimates across sliding windows for a given participant be plotted as a histogram or as a barplot
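The way a start, a duration and a step define a sequence of regularly spaced windows can be illustrated with a few lines of arithmetic. The function sliding_windows() below is a hypothetical helper, not AdhereR's implementation; it works in days only and assumes that only windows fitting entirely within the observation window are kept.

```r
# Hypothetical helper: start and end day (relative to the start of the OW)
# of each sliding window, given the OW duration, the start of the first
# window, the window duration and the step between windows (all in days).
sliding_windows <- function(ow_duration, sw_start, sw_duration, sw_step) {
  stopifnot(sw_step > 0, sw_start + sw_duration <= ow_duration)
  starts <- seq(from = sw_start, to = ow_duration - sw_duration, by = sw_step)
  data.frame(window    = seq_along(starts),
             start_day = starts,
             end_day   = starts + sw_duration)
}

# 90-day windows moved forward by 30 days within a 730-day OW:
sw <- sliding_windows(ow_duration = 730, sw_start = 0,
                      sw_duration = 90, sw_step = 30)
head(sw, 3)
nrow(sw)
```

A step shorter than the window duration yields overlapping windows, as in this example.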
This section is shown only if there’s more than one patient selected, and controls the way the plots of several patients are displayed vertically:
This section controls the amount of temporal information displayed (on the horizontal axis):
0 it is automatically computed so that all the events in the plot are shown
This section is shown for all CMAs except CMA0
and
controls how the CMA estimates are to be shown on the plot:
This section is shown only if a daily dose column is defined
for the current data source, and only for CMA0
,
CMA5
–CMA9
, per episodes, and
sliding windows (CMA1
–CMA4
by
definition are unaware of dose and treatment categories) and controls
how the dose is visually shown (if at all):
Print it?: print the dosage (i.e., the actual numeric values) next to each event; if so:
As line width?: show the dose as the event line width; if so:
Please see the overview figure for an example where the dose is printed.
This section controls the visual appearance of the legend:
Show legend?: should the legend be shown at all; if so:
This section controls many aspects of the visual presentation of the plots, including colors, font sizes and line styles; some of these depend on other factors, so may or may not be visible:
CMA0, per episode, and sliding windows): Cont. line color, Cont. line style and Cont. line width
CMA0)
CMA8 uses a “real observation window”, which can be shown or not (Show real OW?) and whose attributes are the line density (Real OW hash dens) and angle (Real OW hash angle)
CMA0), we can control various attributes of the CMA estimate: the relative font size (CMA font size), the percent of the plotting area dedicated to plotting it (CMA plot area %), its color (CMA plot color), border color (CMA border color), background color (CMA bkg. color) and text color (CMA text color)
This section controls several advanced settings:
If the interactive App was started with a given data source
passed through the parameters to the
plot_interactive_cma(...)
arguments, this data source (if
valid and well-defined) is automatically used, but it can be changed at
any time (as described below). However, if the App was started
without any data source (i.e.,
plot_interactive_cma()
), the user is forced to select a
valid data source before being able to plot anything. The actual
processes of selecting an initial data source or changing it later are
identical, so we discuss here the case of no initial data source: when
plot_interactive_cma()
was invoked, the App is opened
without any plotting and messaging area at all and the
Data tab in the left-hand panel is automatically
selected and the Params tab contains only a warning
message:
The Data panel allows us to interactively select and change on-the-fly the data set to be used; currently, this can be:
data.frame (this includes, thus, things such as data.tables) that is already in the global environment of the current R session (please see, for example, here, for details about environments, but for our purposes here, this contains the “stuff” currently loaded in R’s memory),
The type of data source can be selected with the list Datasource type at the top of the Data panel. We will now go through each of these types of data sources in turn.
This is, in some respects, the simplest type of data source. When selected, the tab looks like:
The In-memory dataset UI element contains the list
of all objects in the current global environment derived from
data.frame
that have at least 1 row and 3 columns, and they
can be selected simply by clicking on their name (here, we select the
med.events
example dataset):
After clicking on it, the dataset is selected and optionally available for inspection using the Peek at dataset button:
If the dataset is not the desired one, it can be replaced with anything else using the interface, but, if it is the one, we can continue by selecting the important columns and the format of the dates.
Please note that, at this time, the dataset is not selected to be used for plotting: this is an explicit action done by pressing the Validate & use! button at the bottom, which does perform various checks (such as that each column is used at most once and that the types more or less fit the expected type and format, among others) and, if OK, makes this dataset the one to be used for plotting.
This is a very useful case, where the data is stored in an external file. When selecting load from file in the Datasource type list, the panel becomes:
The App supports loading data from several file formats:
Comma/TAB-separated (.csv; .tsv; .txt): this is the default format and refers to a class of open and flexible file formats where tabular data is stored as rows of values separated by a pre-defined delimiter; the best known are Comma-Separated Values (CSV) and TAB-Separated Values (TSV) formats, but the App allows a lot of flexibility by defining:
[TAB] character (\t), the comma (,), one or more whitespaces, the semicolon (;) or the colon (:)
Serialized R object (.rds): this loads data (such as objects derived from data.frame) previously exported from R using saveRDS() (usually as “.rds”)
Open Document Spreadsheet (.ods) and
Microsoft Excel (.xls; .xlsx) loads data from these
widespread formats, used by office suites such as LibreOffice/OpenOffice’s Calc
and Microsoft Office’s
Excel
programs, among others; for both these formats, the
user can specify the particular sheet to be loaded for files containing
more than one (Which sheet to load?)
SPSS (.sav; .por), SAS Transport data
file (.xpt), SAS sas7bdat data file
(.sas7bdat) and Stata (.dta): these are file
formats exported by the popular statistical platforms IBM
SPSS
, SAS
and Stata
Please note that while Comma/TAB-separated (.csv; .tsv; .txt), Serialized R object (.rds) and Open Document Spreadsheet (.ods) should be imported without issues, for the others there might be limitations and fringe cases.
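For readers who prefer to check a file from a script before loading it in the App, the two most robust formats can be written and read with base R as sketched below. The toy data and file names are invented for this illustration, and the read.table() arguments shown (header, sep, quote) are base R parameters, not necessarily the exact options exposed by the App.

```r
# A made-up dataset to write and read back:
toy <- data.frame(PATIENT_ID = c(1, 1, 2),
                  DATE       = c("03/22/2035", "03/31/2035", "01/20/2036"),
                  DURATION   = c(30, 30, 50),
                  stringsAsFactors = FALSE)

# Delimited text file, here using the semicolon as the field separator:
csv_file <- tempfile(fileext = ".csv")
write.table(toy, csv_file, sep = ";", quote = TRUE, row.names = FALSE)
from_csv <- read.table(csv_file, header = TRUE, sep = ";", quote = "\"",
                       stringsAsFactors = FALSE)

# Serialized R object: saveRDS() writes it, readRDS() reads it back:
rds_file <- tempfile(fileext = ".rds")
saveRDS(toy, rds_file)
from_rds <- readRDS(rds_file)

str(from_csv)
identical(from_rds, toy)
```

The serialized format preserves column types exactly, whereas delimited text files rely on the reader guessing the types again.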
After the file format has been selected, the user can use the Load from file control (its Select button) to browse for the desired file and upload it. Basic checks might be performed and a file might be rejected, but if the loading was successful, a new set of UI elements becomes visible. These elements are virtually identical to those used for in-memory datasets.
This allows the access to data stored in standard Relational
Database Management Systems (RDBMS’s) which use the Structured Query Language
(SQL) – for more info about the facilities offered by
AdhereR
, please see the Using
AdhereR with various database technologies for processing very large
datasets vignette.
Currently, the App supports SQLite
, a small
engine designed to be embedded in larger applications and which stores
the data in normal files, and MySQL
/MariaDB
, which are
widely-used, full-featured free and open-source RDBMSs.
While SQLite is intended only as a demo of the App’s
capabilities and uses an in-memory database with a single table that
contains a verbatim copy of the med.events
example dataset,
MySQL/MariaDB allows the use of actual databases, local
or over the internet. Except for the selection of the database, the UI
for the two is identical, so we will only discuss here the
MySQL/MariaDB case.
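The idea of an in-memory SQLite database holding a single table can be reproduced from a script as sketched below. This is not the App's own code: it assumes that the DBI and RSQLite packages are installed (the block is skipped otherwise), and the table name med_events and the toy rows are made up for this illustration.

```r
sql_result <- NULL
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  # A made-up miniature version of the events table:
  toy <- data.frame(PATIENT_ID = c(1, 1, 2),
                    DATE       = c("03/22/2035", "03/31/2035", "01/20/2036"),
                    DURATION   = c(30, 30, 50),
                    stringsAsFactors = FALSE)

  # Create an in-memory SQLite database and copy the data into one table:
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  DBI::dbWriteTable(con, "med_events", toy)

  # Fetch only the rows of one patient, as a data.frame:
  sql_result <- DBI::dbGetQuery(con,
                                "SELECT * FROM med_events WHERE PATIENT_ID = 1")

  # Always close the connection when done:
  DBI::dbDisconnect(con)
}
print(sql_result)
```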
We can connect to a local or remote server, and we can (optionally) define the following:
When clicking the Connect! button, the App attempts to connect to the server, authenticate and access the desired database: if everything’s OK, it fetches basic information about all the tables/views in the database (avoiding thus unnecessary traffic) and displays it:
New UI elements become visible, most being virtually identical to those used for loading from file and in-memory datasets, but the specific ones being:
For example, when using the MySQL
database described in
the vignette Using
AdhereR with various database technologies for processing very large
datasets in package AdhereR
, we can create, for
example, a view
(named testview
) that brings
together these data in the needed format with the following
SQL
commands in MySQL Workbench
:
USE med_events;
CREATE VIEW `testview` AS
SELECT patients.`id`, `date`, `category`, `duration`, `perday`
FROM event_date
JOIN event_info
ON event_info.`id` = event_date.`id`
JOIN event_patients
ON event_patients.`id` = event_info.`id`
JOIN patients
ON patients.`id` = event_patients.`patient_id`;
Since AdhereRViz
version 0.2/AdhereR
version 0.7, medication groups can be defined and used for
interactive plotting. For more details about medication groups, please
see the vignette “AdhereR: Adherence to Medications” in package
AdhereR
, but, fundamentally, they are named vectors of
characters where the names are the unique names of groups of medication
defined using R
-like expressions that describe which of the
events in the dataset are covered by a given medication group.
Alternatively, one can use a column in the data itself to
define the medication groups.
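As a rough illustration of what such a named character vector might look like, the sketch below defines two groups over the CATEGORY column and evaluates each expression against a toy data.frame using base R. The group names, the toy data and the evaluation step are assumptions made for this illustration; AdhereR's own handling of medication groups is described in the vignette mentioned above and may differ in its details.

```r
# Hypothetical medication groups: names are the group names, values are
# R-like expressions over the columns of the dataset:
med_groups <- c("GroupA" = "(CATEGORY == 'medA')",
                "GroupB" = "(CATEGORY == 'medB')")

# A made-up miniature dataset:
toy <- data.frame(PATIENT_ID = c(1, 1, 2, 2),
                  CATEGORY   = c("medB", "medB", "medA", "medB"),
                  stringsAsFactors = FALSE)

# For illustration only: evaluate each expression within the data, obtaining
# one logical column per group (TRUE = the event is covered by that group):
covered <- sapply(med_groups,
                  function(expr) eval(parse(text = expr), envir = toy))
print(covered)
colSums(covered)
```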
Medication groups can be passed with the
medication.groups
argument to the
plot_interactive_cma()
function, or can be interactively
loaded through a new panel Groups
of the user interface
(please note that a valid dataset must have already been loaded):
Pressing the “Use it!” button loads the selected medication group definitions; please make sure they are correct and fit the currently loaded dataset! If the loading goes well, the plotting is automatically updated to reflect the medication groups:
The main differences to a plot without medication groups are:
These apply not only to simple CMAs, but also to sliding windows and per episode (not shown).
Various options related to plotting the medication groups can be changed using the new “Medication groups” user interface, probably the most important being the medication groups to be included in the plot:
Dima A.L., Dediu D. (2017) Computation of adherence to medication and visualization of medication histories in R with AdhereR: Towards transparent and reproducible use of electronic healthcare data. PLoS ONE 12(4): e0174426. doi:10.1371/journal.pone.0174426.
Please note that all font sizes are relative. Thus a font size of 1.0 means the default font size used for the plot (depending on resolution, etc.), while a value of 0.50 means half that and 1.25 means 25% bigger.
We have decided against directly mapping each class to a particular color and, instead, automatically mapping them using a palette, because this accommodates more flexibly a varying number (or grouping) of classes; the mapping classes → colors is based on the alphabetic order of the class names.