library(rstan)
#> Loading required package: StanHeaders
#>
#> rstan version 2.26.22 (Stan version 2.26.1)
#> For execution on a local, multicore CPU with excess RAM we recommend calling
#> options(mc.cores = parallel::detectCores()).
#> To avoid recompilation of unchanged Stan programs, we recommend calling
#> rstan_options(auto_write = TRUE)
#> For within-chain threading using `reduce_sum()` or `map_rect()` Stan functions,
#> change `threads_per_chain` option:
#> rstan_options(threads_per_chain = 1)
#> Do not specify '-march=native' in 'LOCAL_CPPFLAGS' or a Makevars file
library(bennu)
library(bayesplot)
#> This is bayesplot version 1.10.0
#> - Online documentation and vignettes at mc-stan.org/bayesplot
#> - bayesplot theme set to bayesplot::theme_default()
#> * Does _not_ affect other ggplot2 plots
#> * See ?bayesplot_theme_set for details on theme setting
library(ggplot2)
rstan_options(auto_write = TRUE)
options(mc.cores = 2)
## basic example code
d <- generate_model_data()
#> Warning: `generate_model_data()` was deprecated in bennu 0.2.1.
#> ℹ Please use `model_random_walk_data()` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
# note iter should be at least 2000 to generate a reasonable posterior sample
fit <- est_naloxone(d,iter=100)
#> Warning: Specifying the `id_cols` argument by position was deprecated in tidyr 1.3.0.
#> ℹ Please explicitly name `id_cols`, like `id_cols = regions`.
#> ℹ The deprecated feature was likely used in the bennu package.
#> Please report the issue at <https://github.com/sempwn/bennu/issues>.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
#> Warning: There were 15 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: There were 2 chains where the estimated Bayesian Fraction of Missing Information was low. See
#> https://mc-stan.org/misc/warnings.html#bfmi-low
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is NA, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
mcmc_pairs(fit, pars = c("sigma","mu0","zeta"),
off_diag_args = list(size = 1, alpha = 0.5))
Return median and 90th percentiles for posterior samples
library(posterior)
#> This is posterior version 1.4.1
#>
#> Attaching package: 'posterior'
#> The following object is masked from 'package:bayesplot':
#>
#> rhat
#> The following objects are masked from 'package:rstan':
#>
#> ess_bulk, ess_tail
#> The following objects are masked from 'package:stats':
#>
#> mad, sd, var
#> The following objects are masked from 'package:base':
#>
#> %in%, match
summarise_draws(fit, default_summary_measures())
#> # A tibble: 462 × 7
#> variable mean median sd mad q5 q95
#> <chr> <num> <num> <num> <num> <num> <num>
#> 1 logp[1] -1.84 -1.76 0.461 0.427 -2.58 -1.17
#> 2 logp[2] -1.87 -1.81 0.414 0.397 -2.54 -1.28
#> 3 logp[3] -1.89 -1.85 0.319 0.312 -2.54 -1.45
#> 4 logp[4] -1.85 -1.83 0.276 0.251 -2.33 -1.44
#> 5 logp[5] -1.58 -1.57 0.263 0.252 -1.99 -1.15
#> 6 logp[6] -1.34 -1.35 0.150 0.159 -1.56 -1.09
#> 7 logp[7] -1.24 -1.24 0.204 0.208 -1.57 -0.921
#> 8 logp[8] -1.18 -1.18 0.138 0.132 -1.38 -0.950
#> 9 logp[9] -1.20 -1.20 0.126 0.145 -1.40 -1.00
#> 10 logp[10] -1.31 -1.33 0.169 0.181 -1.58 -1.06
#> # ℹ 452 more rows
We can compare two different model fits of the probability of prior
kit use to simulated data using the plot_kit_use
function
We can compare the posterior predictive check to the prior predictive
check by re-running the model without the likelihood by setting the
run_estimation
parameter to FALSE
# note iter should be at least 2000 to generate a reasonable posterior sample
prior_fit <- est_naloxone(d,
run_estimation = FALSE,
iter=100)
#> Warning: There were 16 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: There were 4 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
#> https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
#> Warning: There were 1 chains where the estimated Bayesian Fraction of Missing Information was low. See
#> https://mc-stan.org/misc/warnings.html#bfmi-low
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is NA, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
mcmc_pairs(prior_fit, pars = c("sigma","mu0","zeta"),
off_diag_args = list(size = 1, alpha = 0.5))
We can compare the prior and posterior predictive distribution of the
probability of prior kit use using the plot_kit_use
function
The second model uses a random walk of order 1 to characterize the time-dependence structure of the inverse logit probability of naloxone use,
\[\Delta^2c_i = c_i - c_{i+1} \sim N(0,\tau^{-1}) \]
The model can also be fit using a random walk of order 2 increments for the inverse logit probability of naloxone use using the following independent second order increments,
\[\Delta^2c_i = c_i - 2c_{i+1} + c_{i+2} \sim N(0,\tau^{-1}) \]
This version of the model will produce smoother estimates of probability of naloxone use in time.
# note iter should be at least 2000 to generate a reasonable posterior sample
# note use `rw_type` to specify order of random walk
rw2_fit <- est_naloxone(d,
rw_type = 2,
iter=100)
#> Warning: There were 103 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is NA, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
mcmc_pairs(rw2_fit, pars = c("sigma","mu0","zeta"),
off_diag_args = list(size = 1, alpha = 0.5))
Summarize random walk order 2 model draws
summarise_draws(rw2_fit, default_convergence_measures())
#> # A tibble: 462 × 4
#> variable rhat ess_bulk ess_tail
#> <chr> <num> <num> <num>
#> 1 logp[1] 2.29 12.5 20.1
#> 2 logp[2] 1.90 13.1 26.5
#> 3 logp[3] 1.88 10.5 24.2
#> 4 logp[4] 1.78 21.0 107.
#> 5 logp[5] 1.29 13.9 76.6
#> 6 logp[6] 1.21 17.4 102.
#> 7 logp[7] 1.33 16.1 89.6
#> 8 logp[8] 1.64 37.0 52.0
#> 9 logp[9] 1.93 27.7 142.
#> 10 logp[10] 2.01 14.5 50.8
#> # ℹ 452 more rows
We can compare two different model fits of the probability of prior
kit use to simulated data using the plot_kit_use
function
As distribution data are reported from sites some months may not be reported. This would lead to a difference in the frequency of the orders data and the distribution data. The model can account for these differences and infer the distribution during missing months. See the following example for generating distribution data once every three months
## basic example code
d_missing <- generate_model_data(reporting_freq = 3)
ggplot(aes(x=times,y=Reported_Used,color=as.factor(regions)),data=d_missing) +
geom_point()
#> Warning: Removed 32 rows containing missing values (`geom_point()`).
missing_fit <- est_naloxone(d_missing,
rw_type = 2,
iter=100)
#> Warning: There were 53 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is NA, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
mcmc_pairs(missing_fit, pars = c("sigma","mu0","zeta"),
off_diag_args = list(size = 1, alpha = 0.5))
the original model was designed for data aggregated into months. If the order and distribution data is aggregated differently then this can be incorporated into the model by updating the delay in reporting distribution and the delay in ordering to distributing distribution.
The delay in ordering to distributing distribution is composed of a gamma distribution with shape parameter \(\alpha\) and inverse scale parameter \(\beta = 1/\theta\). In order to update for example from monthly to weekly data, we can use the following property of the gamma distribution,
\[\Gamma(cx;\alpha,\beta) = \Gamma(x;\alpha,\beta/c)\]
To convert the delay distribution from month into week set \(c = 1/4\) as one over the approximate number of weeks in a month. The reporting delay distribution is a simple vector of length 3 denoting the empirical delay distribution. This can be expanded into months through interpolation and normalization to retain its property as a probability distribution. Example code (not run) is shown below. Note that these distributions will need to be updated when considering another jurisdiction,
# monthly reporting delay distribution
psi_vec <- c(0.7, 0.2, 0.1)
# convert to weeks using interpolation
weekly_psi_vec <- rep(psi_vec,4) / sum(psi_vec)
# properties of order to distributed delay distribution in months
max_delays <- 3
delay_alpha <- 2
delay_beta <- 1
# convert to weeks
weekly_max_delays <- max_delays*4
weekly_delay_alpha <- delay_alpha
weekly_delay_beta <- 0.25 * delay_beta
result <- est_naloxone(d,
psi_vec = weekly_psi_vec,
max_delays = weekly_max_delays,
delay_alpha = weekly_delay_alpha,
delay_beta = weekly_delay_beta)