library(knitr)
library(data.table)
#> data.table 1.14.2 using 24 threads (see ?getDTthreads). Latest news: r-datatable.com
library(brms)
#> Loading required package: Rcpp
#> Loading 'brms' package (version 2.17.0). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#>
#> Attaching package: 'brms'
#> The following object is masked from 'package:stats':
#>
#> ar
library(brmsmargins)
This vignette provides a brief overview of how to calculate marginal
effects for Bayesian location scale regression models, involving fixed
effects only or mixed effects (i.e., fixed and random) and fit using the
brms
package.
A simpler introduction and very brief overview and motivation for marginal effects is available in the vignette for fixed effects only.
This vignette will focus on Gaussian location scale models fit with
brms
. Gaussian location scale models in brms
have two distributional parameters (dpar):
Location scale models allow things like assumptions of homogeneity of variance to be relaxed. In repeated measures data, random effects for the scale allow calculating and predicting intraindividual variability (IIV).
To start with, we will look at a fixed effects only location scale model. We will simulate a dataset.
d <- withr::with_seed(
seed = 12345, code = {
nObs <- 1000L
d <- data.table(
grp = rep(0:1, each = nObs / 2L),
x = rnorm(nObs, mean = 0, sd = 0.25))
d[, y := rnorm(nObs,
mean = x + grp,
sd = exp(1 + x + grp))]
copy(d)
})
ls.fe <- brm(bf(
y ~ 1 + x + grp,
sigma ~ 1 + x + grp),
family = "gaussian",
data = d, seed = 1234,
silent = 2, refresh = 0,
chains = 4L, cores = 4L, backend = "cmdstanr")
#> Compiling Stan program...
summary(ls.fe)
#> Family: gaussian
#> Links: mu = identity; sigma = log
#> Formula: y ~ 1 + x + grp
#> sigma ~ 1 + x + grp
#> Data: d (Number of observations: 1000)
#> Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup draws = 4000
#>
#> Population-Level Effects:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept -0.09 0.12 -0.33 0.15 1.00 4829 3553
#> sigma_Intercept 1.01 0.03 0.94 1.07 1.00 4696 3023
#> x 1.62 0.45 0.75 2.49 1.00 4470 2830
#> grp 1.02 0.35 0.34 1.69 1.00 2526 2684
#> sigma_x 0.85 0.09 0.67 1.02 1.00 4878 3309
#> sigma_grp 1.01 0.05 0.92 1.09 1.00 4425 2871
#>
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
Now we can use brmsmargins()
. By default, it will be for
the location parameter, the mean. As this is a Gaussian linear model
with no transformations and not interactions, the AMEs are the same as
the regression coefficients.
Here is an example continuous AME.
h <- .001
ame1 <- brmsmargins(
ls.fe,
add = data.frame(x = c(0, h)),
contrasts = cbind("AME x" = c(-1 / h, 1 / h)),
CI = 0.95, CIType = "ETI",
effects = "fixedonly")
knitr::kable(ame1$ContrastSummary, digits = 3)
M | Mdn | LL | UL | PercentROPE | PercentMID | CI | CIType | ROPE | MID | Label |
---|---|---|---|---|---|---|---|---|---|---|
1.623 | 1.631 | 0.746 | 2.492 | NA | NA | 0.95 | ETI | NA | NA | AME x |
Here is an AME for discrete / categorical predictors.
ame2 <- brmsmargins(
ls.fe,
at = data.frame(grp = c(0, 1)),
contrasts = cbind("AME grp" = c(-1, 1)),
CI = 0.95, CIType = "ETI",
effects = "fixedonly")
knitr::kable(ame2$ContrastSummary, digits = 3)
M | Mdn | LL | UL | PercentROPE | PercentMID | CI | CIType | ROPE | MID | Label |
---|---|---|---|---|---|---|---|---|---|---|
1.016 | 1.02 | 0.343 | 1.69 | NA | NA | 0.95 | ETI | NA | NA | AME grp |
In brms
the scale parameter for Gaussian models,
sigma
uses a log link function. Therefore when back
transformed to the original scale, the AMEs will not be the same as the
regression coefficients which are on the link scale (log
transformed).
We specify that we want AMEs for sigma
by setting:
dpar = "sigma"
. Here is a continuous example.
h <- .001
ame3 <- brmsmargins(
ls.fe,
add = data.frame(x = c(0, h)),
contrasts = cbind("AME x" = c(-1 / h, 1 / h)),
CI = 0.95, CIType = "ETI", dpar = "sigma",
effects = "fixedonly")
knitr::kable(ame3$ContrastSummary, digits = 3)
M | Mdn | LL | UL | PercentROPE | PercentMID | CI | CIType | ROPE | MID | Label |
---|---|---|---|---|---|---|---|---|---|---|
4.463 | 4.456 | 3.488 | 5.442 | NA | NA | 0.95 | ETI | NA | NA | AME x |
Here is a discrete / categorical example.
ame4 <- brmsmargins(
ls.fe,
at = data.frame(grp = c(0, 1)),
contrasts = cbind("AME grp" = c(-1, 1)),
CI = 0.95, CIType = "ETI", dpar = "sigma",
effects = "fixedonly")
knitr::kable(ame4$ContrastSummary, digits = 3)
M | Mdn | LL | UL | PercentROPE | PercentMID | CI | CIType | ROPE | MID | Label |
---|---|---|---|---|---|---|---|---|---|---|
4.907 | 4.905 | 4.409 | 5.436 | NA | NA | 0.95 | ETI | NA | NA | AME grp |
These results are comparable to the mean difference in standard
deviation by grp
. Note that in general, these may not
closely align. However, in this instance as x
and
grp
were simulated to be uncorrelated, the simple
unadjusted results match the regression results closely.
d[, .(SD = sd(y)), by = grp][, diff(SD)]
[1] 4.976021
We will simulate some multilevel location scale data for model and fit the mixed effects location scale model.
dmixed <- withr::with_seed(
seed = 12345, code = {
nGroups <- 100
nObs <- 20
theta.location <- matrix(rnorm(nGroups * 2), nrow = nGroups, ncol = 2)
theta.location[, 1] <- theta.location[, 1] - mean(theta.location[, 1])
theta.location[, 2] <- theta.location[, 2] - mean(theta.location[, 2])
theta.location[, 1] <- theta.location[, 1] / sd(theta.location[, 1])
theta.location[, 2] <- theta.location[, 2] / sd(theta.location[, 2])
theta.location <- theta.location %*% chol(matrix(c(1.5, -.25, -.25, .5^2), 2))
theta.location[, 1] <- theta.location[, 1] - 2.5
theta.location[, 2] <- theta.location[, 2] + 1
dmixed <- data.table(
x = rep(rep(0:1, each = nObs / 2), times = nGroups))
dmixed[, ID := rep(seq_len(nGroups), each = nObs)]
for (i in seq_len(nGroups)) {
dmixed[ID == i, y := rnorm(
n = nObs,
mean = theta.location[i, 1] + theta.location[i, 2] * x,
sd = exp(1 + theta.location[i, 1] + theta.location[i, 2] * x))
]
}
copy(dmixed)
})
ls.me <- brm(bf(
y ~ 1 + x + (1 + x | ID),
sigma ~ 1 + x + (1 + x | ID)),
family = "gaussian",
data = dmixed, seed = 1234,
silent = 2, refresh = 0,
chains = 4L, cores = 4L, backend = "cmdstanr")
#> Compiling Stan program...
#> Warning: 102 of 4000 (3.0%) transitions hit the maximum treedepth limit of 10.
#> See https://mc-stan.org/misc/warnings for details.
Note that this model has not achieved good convergence, but as it already took about 6 minutes to run, for the sake of demonstration we continue. In practice, one would want to make adjustments to ensure good convergence and an adequate effective sample size.
summary(ls.me)
#> Family: gaussian
#> Links: mu = identity; sigma = log
#> Formula: y ~ 1 + x + (1 + x | ID)
#> sigma ~ 1 + x + (1 + x | ID)
#> Data: dmixed (Number of observations: 2000)
#> Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup draws = 4000
#>
#> Group-Level Effects:
#> ~ID (Number of levels: 100)
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
#> sd(Intercept) 1.19 0.09 1.03 1.37 1.02 211
#> sd(x) 0.42 0.04 0.35 0.51 1.00 862
#> sd(sigma_Intercept) 1.27 0.09 1.11 1.47 1.01 536
#> sd(sigma_x) 0.50 0.05 0.41 0.61 1.00 1569
#> cor(Intercept,x) -0.40 0.12 -0.61 -0.16 1.00 577
#> cor(sigma_Intercept,sigma_x) -0.36 0.11 -0.55 -0.13 1.00 1516
#> Tail_ESS
#> sd(Intercept) 519
#> sd(x) 1634
#> sd(sigma_Intercept) 819
#> sd(sigma_x) 2418
#> cor(Intercept,x) 1159
#> cor(sigma_Intercept,sigma_x) 2200
#>
#> Population-Level Effects:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept -2.55 0.12 -2.77 -2.33 1.03 105 300
#> sigma_Intercept -1.48 0.13 -1.72 -1.22 1.02 217 388
#> x 0.94 0.06 0.83 1.06 1.00 572 1550
#> sigma_x 0.97 0.06 0.85 1.09 1.00 1031 1892
#>
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
We use brmsmargins()
similar as for other mixed effects
models. For more details see the vignette on marginal effects for mixed
effects models.
Here is an example treating x
as continuous using only
the fixed effects for the AME for the scale parameter,
sigma
.
h <- .001
ame1a.lsme <- brmsmargins(
ls.me,
add = data.frame(x = c(0, h)),
contrasts = cbind("AME x" = c(-1 / h, 1 / h)),
dpar = "sigma",
effects = "fixedonly")
knitr::kable(ame1a.lsme$ContrastSummary, digits = 3)
M | Mdn | LL | UL | PercentROPE | PercentMID | CI | CIType | ROPE | MID | Label |
---|---|---|---|---|---|---|---|---|---|---|
0.408 | 0.404 | 0.283 | 0.555 | NA | NA | 0.99 | HDI | NA | NA | AME x |
Here is the example again, this time integrating out the random effects, which results in a considerable difference in the estimate of the AME.
h <- .001
ame1b.lsme <- brmsmargins(
ls.me,
add = data.frame(x = c(0, h)),
contrasts = cbind("AME x" = c(-1 / h, 1 / h)),
dpar = "sigma",
effects = "integrateoutRE", k = 100L, seed = 1234)
knitr::kable(ame1b.lsme$ContrastSummary, digits = 3)
M | Mdn | LL | UL | PercentROPE | PercentMID | CI | CIType | ROPE | MID | Label |
---|---|---|---|---|---|---|---|---|---|---|
0.804 | 0.766 | 0.391 | 1.575 | NA | NA | 0.99 | HDI | NA | NA | AME x |
Here is an example treating x
as discrete, using only
the fixed effects.
ame2a.lsme <- brmsmargins(
ls.me,
at = data.frame(x = c(0, 1)),
contrasts = cbind("AME x" = c(-1, 1)),
dpar = "sigma",
effects = "fixedonly")
knitr::kable(ame2a.lsme$ContrastSummary)
M | Mdn | LL | UL | PercentROPE | PercentMID | CI | CIType | ROPE | MID | Label |
---|---|---|---|---|---|---|---|---|---|---|
0.3782473 | 0.3750965 | 0.2673526 | 0.509232 | NA | NA | 0.99 | HDI | NA | NA | AME x |
Here is the example again, this time integrating out the random effects, likely the more appropriate estimate for most use cases.
ame2b.lsme <- brmsmargins(
ls.me,
at = data.frame(x = c(0, 1)),
contrasts = cbind("AME x" = c(-1, 1)),
dpar = "sigma",
effects = "integrateoutRE", k = 100L, seed = 1234)
knitr::kable(ame2b.lsme$ContrastSummary)
M | Mdn | LL | UL | PercentROPE | PercentMID | CI | CIType | ROPE | MID | Label |
---|---|---|---|---|---|---|---|---|---|---|
0.7127988 | 0.6795497 | 0.3513727 | 1.382269 | NA | NA | 0.99 | HDI | NA | NA | AME x |
This also is relatively close calculating all the individual standard deviations and taking their differences, then averaging.
dmixed[, .(SD = sd(y)), by = .(ID, x)
][, .(SDdiff = diff(SD)), by = ID][, mean(SDdiff)]
#> [1] 0.6281889