By default, the package uses a pooled logistic regression model for survival outcomes, logistic regression model for binary end-of-follow-up outcomes, and a linear regression model for continuous end-of-follow-up outcomes. Starting from version 1.1.0, the package allows users to apply their own type of outcome models. This document describes how to specify such custom outcome models. This document assumes that readers have read the long-form package documentation of McGrath et al. (2020).
To specify custom outcome models, users must provide functions that fit the outcome model and obtain estimates from the fitted model through the parameters and , respectively, in the function.
The function for fitting the outcome model must take the parameters and . Below, we illustrate a function for fitting an outcome model using a random forest. This code uses the package.
ymodel_fit_custom <- function(ymodel, obs_data){
return(randomForest::randomForest(formula = ymodel, data = obs_data))
}
The function for obtaining estimates from the model must take the parameters (the fitted outcome model) and (a containing the simulated dataset at time \(t\)). This function must return the estimated probability of the outcome for survival and binary end-of-follow-up outcomes or the estimated mean of the outcome for continuous end-of-follow-up outcomes in . Continuing with the random forest example, the code below obtains the estimated outcome mean for a continuous end-of-follow-up outcome. This code leverages the function in the package.
We perform an analysis similar to that Example 3 in McGrath et al. (2020), except we use the custom outcome model from the previous section.
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
##
## format.pval, units
id <- 'id'
time_name <- 't0'
covnames <- c('L1', 'L2', 'A')
outcome_name <- 'Y'
outcome_type <- 'continuous_eof'
covtypes <- c('categorical', 'normal', 'binary')
histories <- c(lagged)
histvars <- list(c('A', 'L1', 'L2'))
covparams <- list(covmodels = c(L1 ~ lag1_A + lag1_L1 + L3 + t0 +
rcspline.eval(lag1_L2, knots = c(-1, 0, 1)),
L2 ~ lag1_A + L1 + lag1_L1 + lag1_L2 + L3 + t0,
A ~ lag1_A + L1 + L2 + lag1_L1 + lag1_L2 + L3 + t0))
ymodel <- Y ~ A + L1 + L2 + lag1_A + lag1_L1 + lag1_L2 + L3
intervention1.A <- list(static, rep(0, 7))
intervention2.A <- list(static, rep(1, 7))
int_descript <- c('Never treat', 'Always treat')
nsimul <- 10000
gform_cont_eof <- gformula(obs_data = continuous_eofdata,
id = id, time_name = time_name,
covnames = covnames, outcome_name = outcome_name,
outcome_type = outcome_type, covtypes = covtypes,
covparams = covparams, ymodel = ymodel,
ymodel_fit_custom = ymodel_fit_custom,
ymodel_predict_custom = ymodel_predict_custom,
intervention1.A = intervention1.A,
intervention2.A = intervention2.A,
int_descript = int_descript,
histories = histories, histvars = histvars,
basecovs = c("L3"), nsimul = nsimul, seed = 1234)
gform_cont_eof
## PREDICTED RISK UNDER MULTIPLE INTERVENTIONS
##
## Intervention Description
## 0 Natural course
## 1 Never treat
## 2 Always treat
##
## Sample size = 2500, Monte Carlo sample size = 10000
## Number of bootstrap samples = 0
## Reference intervention = natural course (0)
##
##
## k Interv. NP mean g-form mean Mean ratio Mean difference % Intervened On
## <num> <num> <num> <num> <num> <num> <num>
## 6 0 -4.414543 -4.336936 1.000000 0.0000000 0.0
## 6 1 NA -3.118344 0.719020 1.2185924 100.0
## 6 2 NA -4.597293 1.060032 -0.2603564 74.2
## Aver % Intervened On
## <num>
## 0.00000
## 81.78714
## 17.61286
McGrath S, Lin V, Zhang Z, Petito LC, Logan RW, Hernán MA, Young JG. gfoRmula: an R package for estimating the effects of sustained treatment strategies via the parametric g-formula. Patterns. 2020 Jun 12;1(3).