Why?
Medical device event data are messy.
Common challenges include:
How?
The mds package provides a standardized framework to address these challenges:
R files for auditability, documentation, and reproducibility
Purpose of This Vignette
mds
mds functions: deviceevent(), exposure(), define_analyses(), time_series()
Note on Statistical Algorithms
mds data and analysis standards allow for seamless application of various statistical trending algorithms via the mdsstat package (under development).
Our example dataset maude was queried from the FDA MAUDE API and contains 535 reported events on bone cement in 2017. Furthermore, a simulated exposure dataset sales was generated to provide denominator data for our bone cement events.
head(maude, 3)
report_number | event_type | date_received | product_problem_flag | adverse_event_flag | report_source_code | lot_number | model_number | manufacturer_d_name | manufacturer_d_country | brand_name | device_name | medical_specialty_description | device_class | region |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0002249697-2017-00023 | Malfunction | 20170103 | Y | N | Manufacturer report | MHX076 | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | Bone Cement | Orthopedic | 2 | Central | |
0002249697-2017-00028 | Malfunction | 20170103 | Y | N | Manufacturer report | MHX080 | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | Bone Cement | Orthopedic | 2 | West | |
0002249697-2017-00025 | Malfunction | 20170103 | Y | N | Manufacturer report | MHX076 | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | Bone Cement | Orthopedic | 2 | Central |
head(sales, 3)
device_name | region | sales_month | sales_volume |
---|---|---|---|
Arthroscope | Central | 2017-01-01 | 83 |
Arthroscope | Central | 2017-02-01 | 119 |
Arthroscope | Central | 2017-03-01 | 112 |
The general workflow to go from data to trending over time is as follows:
1. deviceevent() to standardize device-event data.
2. exposure() to standardize exposure data (optional).
3. define_analyses() to enumerate possible analysis combinations.
4. time_series() to generate counts (and/or rates) by time based on your defined analyses.
# Step 1 - Device Events
de <- deviceevent(
maude,
time="date_received",
device_hierarchy=c("device_name", "device_class"),
event_hierarchy=c("event_type", "medical_specialty_description"),
key="report_number",
covariates="region",
descriptors="_all_")
# Step 2 - Exposures (Optional step)
ex <- exposure(
sales,
time="sales_month",
device_hierarchy="device_name",
match_levels="region",
count="sales_volume")
# Step 3 - Define Analyses
da <- define_analyses(
de,
device_level="device_name",
exposure=ex,
covariates="region")
# Step 4 - Time Series
ts <- time_series(
da,
deviceevents=de,
exposure=ex)
You may:
(de, ex), analyses (da), and time series (ts) for documentation
summary() and define_analyses_dataframe()
plot() your time series (plotting options)
(mdsstat package)
summary(da)
#> $`Analyses Timestamp`
#> [1] "2020-06-14 21:45:56 EDT"
#>
#> $`Analyses Counts`
#> Total Analyses Analyses with Exposure Device Levels
#> 27 27 6
#> Event Levels Covariates
#> 1 2
#>
#> $`Date Ranges`
#> Data Start End
#> 1 Device-Event 2017-01-01 2017-12-01
#> 2 Exposure 2017-01-01 2017-12-01
#> 3 Both 2017-01-01 2017-12-01
dadf <- define_analyses_dataframe(da)
head(dadf, 3)
id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | exp_device_level | exp_covariate_level | date_range_exposure_start | date_range_exposure_end | date_range_de_exp_start | date_range_de_exp_end |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | device_name | Bone Cement | device_class | 2 | event_type | All | region | Central | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | Central | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
2 | device_name | Bone Cement | device_class | 2 | event_type | All | region | West | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | West | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
3 | device_name | Bone Cement | device_class | 2 | event_type | All | region | East | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | East | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
deviceevent() to Standardize Device-Event Data
Basic Usage
de <- deviceevent(maude, "date_received", c("device_name", "device_class"), c("event_type", "medical_specialty_description"))
head(de, 3)
key | time | device_1 | device_2 | event_1 | event_2 |
---|---|---|---|---|---|
1 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic |
2 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic |
3 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic |
Advanced Usage
de <- deviceevent(
maude,
time="date_received",
device_hierarchy=c("device_name", "device_class"),
event_hierarchy=c("event_type", "medical_specialty_description"),
key="report_number",
covariates="region",
descriptors="_all_")
head(de, 3)
key | time | device_1 | device_2 | event_1 | event_2 | region | product_problem_flag | adverse_event_flag | report_source_code | lot_number | model_number | manufacturer_d_name | manufacturer_d_country | brand_name |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0002249697-2017-00023 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic | Central | Y | N | Manufacturer report | MHX076 | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | |
0002249697-2017-00028 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic | West | Y | N | Manufacturer report | MHX080 | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | |
0002249697-2017-00025 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic | Central | Y | N | Manufacturer report | MHX076 | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK |
data_frame
time
Date format.
device_hierarchy
mds remembers this hierarchy and allows trending at multiple levels as you specify.
event_hierarchy
descriptors argument. The hierarchical concept reflects how events are often nested into progressively more general groups. Set the first variable as the lowest event level that you would like to trend at. mds remembers this hierarchy and allows trending at multiple levels as you specify. If your data does not have an event variable, you will need to create a dummy variable.
key
data_frame. If your data pipeline carries over a key variable, it is recommended to specify it here. The key allows downstream aggregated analysis to be able to “look up” individual constituent events.
covariates
covariates="Region" will allow analysis of regions within device. These variables should be categorical in nature.
descriptors
implant_days
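The example tables above show date_received stored as 20170103 in maude and as 2017-01-03 after standardization. If you would rather convert such a column yourself before calling deviceevent(), the following base R sketch shows one way to do it. The toy data frame and its values are hypothetical, and this is not a description of how mds converts dates internally.

```r
# Toy stand-in for a raw event table; column names mirror the maude example.
raw <- data.frame(
  report_number = c("A-001", "A-002"),
  date_received = c(20170103, 20170215)
)

# Format the numeric YYYYMMDD value as an 8-character string, then parse it.
date_chr <- sprintf("%08d", as.integer(raw$date_received))
raw$date_received <- as.Date(date_chr, format = "%Y%m%d")

str(raw$date_received)
```

Going through an integer first avoids any risk of the number being rendered in scientific notation before parsing.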
exposure() to Standardize Exposure Data
Exposure data is meant to support device-event data. As such, the general expectation is that variable values match between exposure and device-event data. For example, 10 exposures for ev3 Solitaire in France will be matched exactly to ev3 Solitaire events in France, and not to events for EV3 SOLITAIRE in FRANCE.
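Because matching is exact, it can help to check for mismatched values before standardizing. The sketch below uses base R only and hypothetical device names; the normalize() helper is an illustration, not part of mds, and upper-casing is just one possible harmonization rule.

```r
# Hypothetical device names as they might appear in each source.
event_devices    <- c("ev3 Solitaire", "Bone Cement")
exposure_devices <- c("EV3 SOLITAIRE", "Bone Cement")

# Exposure values with no exact counterpart in the event data.
unmatched <- setdiff(exposure_devices, event_devices)
print(unmatched)

# One possible harmonization: trim whitespace and upper-case both sides.
normalize <- function(x) toupper(trimws(x))
still_unmatched <- setdiff(normalize(exposure_devices), normalize(event_devices))
print(still_unmatched)
```

Any harmonization of this kind would need to be applied to both datasets before they are passed to deviceevent() and exposure().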
Basic Usage
head(ex, 3)
key | time | count | device_1 |
---|---|---|---|
1 | 2017-01-01 | 1 | Arthroscope |
2 | 2017-02-01 | 1 | Arthroscope |
3 | 2017-03-01 | 1 | Arthroscope |
Advanced Usage
ex <- exposure(
sales,
time="sales_month",
device_hierarchy="device_name",
match_levels="region",
count="sales_volume")
head(ex, 3)
key | time | count | device_1 | region |
---|---|---|---|---|
1 | 2017-01-01 | 83 | Arthroscope | Central |
2 | 2017-02-01 | 119 | Arthroscope | Central |
3 | 2017-03-01 | 112 | Arthroscope | Central |
Note: Although not required, count will commonly be used as well.
data_frame
time
Date format. If exposure will be used, it is critical to have sufficient time granularity. For example, if analysis will be done monthly, exposure data must be no less granular than monthly. mds does not make assumptions about filling in holes in time!
device_hierarchy
device_hierarchy parameter.
event_hierarchy
event_hierarchy parameter. Exposures at an event level are not common.
count
key
data_frame. If your data pipeline carries over a key variable, it is recommended to specify it here. The key allows downstream aggregated analysis to be able to “look up” individual constituent exposure records.
match_levels
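Since holes in time are not filled in for you, it may be worth checking the exposure data for missing periods before calling exposure(). A minimal base R sketch, assuming monthly records dated on the first of each month (the dates below are hypothetical):

```r
# Hypothetical monthly exposure records with March 2017 missing.
sales_month <- as.Date(c("2017-01-01", "2017-02-01", "2017-04-01"))

# Every month expected between the first and last record.
expected <- seq(min(sales_month), max(sales_month), by = "month")

# Months that have no exposure record at all.
missing_months <- expected[!(expected %in% sales_month)]
print(missing_months)
```

How to handle a missing month (zero exposure, interpolation, or exclusion) is a judgment call that depends on your data source.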
define_analyses() to Enumerate Analysis Combinations
After standardizing device-event data using deviceevent() and, optionally, exposure data using exposure(), the next step is to discover what types of analyses are possible. This is separated from actually doing the analysis (counting, calculations, statistics, etc.) because:
Basic Usage
Note that define_analyses() returns a list of individual analyses. Each individual analysis contains a set of instructions. You can view an analysis by submitting da[[1]], da[[2]], etc., but a less cumbersome overview is possible using summary() and define_analyses_dataframe().
summary(da)
#> $`Analyses Timestamp`
#> [1] "2020-06-14 21:45:58 EDT"
#>
#> $`Analyses Counts`
#> Total Analyses Analyses with Exposure Device Levels
#> 7 0 6
#> Event Levels Covariates
#> 1 1
#>
#> $`Date Ranges`
#> Data Start End
#> 1 Device-Event 2017-01-01 2017-12-01
#> 2 Exposure <NA> <NA>
#> 3 Both 2017-01-01 2017-12-01
head(define_analyses_dataframe(da), 3)
id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | date_range_de_exp_start | date_range_de_exp_end |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | device_name | Bone Cement | device_class | 2 | event_type | All | Data | All | FALSE | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
2 | device_name | Bone Cement, Antibiotic | device_class | 2 | event_type | All | Data | All | FALSE | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
3 | device_name | Cement, Bone, Vertebroplasty | device_class | 2 | event_type | All | Data | All | FALSE | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
Advanced Usage
summary(da)
#> $`Analyses Timestamp`
#> [1] "2020-06-14 21:45:59 EDT"
#>
#> $`Analyses Counts`
#> Total Analyses Analyses with Exposure Device Levels
#> 27 27 6
#> Event Levels Covariates
#> 1 2
#>
#> $`Date Ranges`
#> Data Start End
#> 1 Device-Event 2017-01-01 2017-12-01
#> 2 Exposure 2017-01-01 2017-12-01
#> 3 Both 2017-01-01 2017-12-01
head(define_analyses_dataframe(da), 3)
id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | exp_device_level | exp_covariate_level | date_range_exposure_start | date_range_exposure_end | date_range_de_exp_start | date_range_de_exp_end |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | device_name | Bone Cement | device_class | 2 | event_type | All | region | Central | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | Central | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
2 | device_name | Bone Cement | device_class | 2 | event_type | All | region | West | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | West | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
3 | device_name | Bone Cement | device_class | 2 | event_type | All | region | East | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | East | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
deviceevents
(class() should contain "mds_de")
device_level
attributes(de)$device_hierarchy.
event_level
attributes(de)$event_hierarchy.
exposure
(class() should contain "mds_e")
date_level and date_level_n
"months" and 1 analyzes by month. Other examples include "months" and 12 for yearly, or "days" and 7 for weekly.
covariates
c("region") analyzes by each level of region within device.
times_to_calc
date_level and date_level_n.
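To picture how a unit ("months", "days") and a multiplier (1, 7, 12) combine into time bins, the sketch below uses base R's cut() on hypothetical event dates. It is only an analogy for the date_level and date_level_n pairing and makes no claim about how mds bins dates internally.

```r
# Hypothetical event dates.
event_dates <- as.Date(c("2017-01-03", "2017-01-20", "2017-02-11", "2017-04-02"))

# Analogous to "months" and 1: calendar-month bins.
monthly <- table(cut(event_dates, breaks = "1 month"))

# Analogous to "days" and 7: seven-day bins starting at the earliest date.
weekly <- table(cut(event_dates, breaks = "7 days"))

print(monthly)
print(weekly)
```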
It is always assumed that analyses at aggregated levels are desired (such as analysis of all events for a given device, or analysis of all events across all devices).
Aggregated level analysis is easily recognized by the "All" and "Data" values in device_level, event_level, covariate, and covariate_level.
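These marker values make aggregated analyses easy to filter once the definitions are in data frame form. The sketch below uses a small hand-built data frame as a stand-in for the output of define_analyses_dataframe(); the rows are illustrative and only a few of the real columns are included.

```r
# Toy stand-in for define_analyses_dataframe() output; values are illustrative.
dadf_toy <- data.frame(
  id              = 1:3,
  device_level    = c("Bone Cement", "Bone Cement", "All"),
  covariate       = c("region", "Data", "Data"),
  covariate_level = c("Central", "All", "All"),
  stringsAsFactors = FALSE
)

# Analyses aggregated over the covariate (no split by region).
agg_cov <- dadf_toy[dadf_toy$covariate == "Data" & dadf_toy$covariate_level == "All", ]

# Analyses aggregated over all devices.
agg_dev <- dadf_toy[dadf_toy$device_level == "All", ]

print(agg_cov$id)
print(agg_dev$id)
```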
 | id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | exp_device_level | exp_covariate_level | date_range_exposure_start | date_range_exposure_end | date_range_de_exp_start | date_range_de_exp_end |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
11 | 11 | device_name | Cement, Bone, Vertebroplasty | device_class | 2 | event_type | All | region | East | FALSE | 2017-01-01 | 2017-08-01 | Cement, Bone, Vertebroplasty | East | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-08-01 |
12 | 12 | device_name | Cement, Bone, Vertebroplasty | device_class | 2 | event_type | All | region | Central | FALSE | 2017-02-01 | 2017-12-01 | Cement, Bone, Vertebroplasty | Central | 2017-01-01 | 2017-12-01 | 2017-02-01 | 2017-12-01 |
NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
There are several options:
(da[[c(1:5, 24:27)]])
define_analyses() with different parameter settings
(da[[1]]$date_range_exposure['start'] <- as.Date("2016-10-01"))
time_series() to Generate Counts, Rates, and More
Once an analysis has been defined using define_analyses(), the analysis instructions can be executed using time_series(), returning by defined time periods:
(key parameter from deviceevent()) for lookup of individual event records.
(key parameter from exposure()) for lookup of individual exposure records.
Basic Usage
Note that time_series() returns, in a list, one time series data frame for every analysis. You can select a time series by submitting ts[[1]], ts[[2]], etc.
head(ts[[1]], 3)
time | nA | ids |
---|---|---|
17167 | 13 | 0002249697-2017-00023 |
17198 | 7 | 0002249697-2017-00488 |
17226 | 5 | 0002249697-2017-00755 |
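The time column in the table above prints as integers such as 17167. These values are consistent with R's internal Date representation (days since 1970-01-01), so, assuming the column lost its Date class only when the table was rendered, they can be read back as dates with base R:

```r
# Values copied from the time column of the table above.
time_raw <- c(17167, 17198, 17226)

# Interpret them as days since the Unix epoch.
time_as_date <- as.Date(time_raw, origin = "1970-01-01")
print(time_as_date)
```

The result is the first day of January, February, and March 2017, which agrees with the monthly date ranges reported by summary(da).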
Advanced Usage
head(ts[[1]], 3)
time | nA | ids | exposure | ids_exposure |
---|---|---|---|---|
17167 | 13 | 0002249697-2017-00023 | 8597 | 37 |
17198 | 7 | 0002249697-2017-00488 | 5115 | 38 |
17226 | 5 | 0002249697-2017-00755 | 10191 | 39 |
analysis
(class() should contain "mds_da") or a list of defined analyses.
deviceevents
(class() contains "mds_de"). It is typically the same data frame used to generate analysis, but can be another "mds_de" data frame, such as a cut of the data at a different time. Note that if, say, an older dataset is being used, the analysis date ranges must correspond.
exposure
(class() contains "mds_e"). It is typically the same data frame used to generate analysis. Like deviceevents, another data frame may be used, but the analysis instructions must correspond.
use_hierarchy
?time_series.mds_da for more details.
It is not uncommon to adjust event and exposure counts, such as with applications of rolling or moving averages. These adjustments should be applied after generating time series data frames from time_series().
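As an illustration of such a post-hoc adjustment, the sketch below applies a trailing 3-period moving average to a vector of event counts using stats::filter(). The first three counts are taken from the example output above; the remaining values are hypothetical, and the window length of 3 is an arbitrary choice.

```r
# Event counts by period: first three from the example, the rest hypothetical.
nA <- c(13, 7, 5, 9, 11, 6)

# Trailing moving average over the current and two preceding periods.
# sides = 1 uses past values only, so the first two results are NA.
nA_roll3 <- as.numeric(stats::filter(nA, rep(1 / 3, 3), sides = 1))
print(nA_roll3)
```

In practice the smoothed vector would be added as a new column to an individual time series data frame such as ts[[1]], leaving the original counts in place.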
plot()ing a Time Series
Plotting an individual time series generated by time_series() is simple: call plot() on the time series object.
There are a few custom parameters, including:
mode
"nA" (representing the device-event of interest), "exposure", and "rate" (simply "nA"/"exposure"). Less common are "nB", "nC", and "nD" representing the cell counts of the disproportionality analysis (DPA) contingency table.
xlab, ylab, main
plot() behavior. By default, axes and title labels are inferred directly from the time series.
All other parameters are from plot.default().
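The "rate" mode is described above as "nA" divided by "exposure". To make that arithmetic concrete, the sketch below rebuilds the first three rows of the Advanced Usage time series as a plain data frame and computes the rate by hand. It is a base R stand-in for illustration, not the plotting method itself.

```r
# First three periods from the Advanced Usage example, as a plain data frame.
ts_toy <- data.frame(
  time     = as.Date(c("2017-01-01", "2017-02-01", "2017-03-01")),
  nA       = c(13, 7, 5),
  exposure = c(8597, 5115, 10191)
)

# Events per unit of exposure in each period.
ts_toy$rate <- ts_toy$nA / ts_toy$exposure
print(ts_toy)
```

Because exposure varies considerably between periods, the rate can move differently from the raw count, which is the main reason to supply denominator data.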