survML: Tools for Flexible Survival Analysis Using Machine Learning

Note: The current development version of survML adds functionality for estimating variable importance and for estimating a covariate-adjusted survival curve under current status sampling, in addition to the survival stacking functionality included in versions 1.1.0 and earlier. A new version on CRAN is forthcoming.

The survML package contains a variety of functions for analyzing survival data using machine learning. These include:

  1. Global and local survival stacking: Use off-the-shelf machine learning tools to estimate conditional survival functions.

  2. Algorithm-agnostic variable importance: Use debiased machine learning to estimate and make inference on variable importance for prediction of time-to-event outcomes.

  3. Current-status isotonic regression: Use isotonic regression to estimate the covariate-adjusted survival function of a time-to-event outcome under current status sampling.

See the package vignettes and function reference for more details.
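
To give a feel for the idea behind item 3, the following is a minimal sketch of isotonic regression for current status data without covariates, using only base R (`stats::isoreg`). It is not the survML implementation, which adjusts for covariates; the simulated data and all variable names (`event_time`, `monitor_time`, `delta`) are assumptions made for illustration.

```r
# Simulated current status data (illustrative only, not from the package).
set.seed(1)
n <- 500
event_time <- rexp(n, rate = 0.5)            # latent event time T, never observed directly
monitor_time <- runif(n, min = 0, max = 5)   # monitoring time Y
delta <- as.numeric(event_time <= monitor_time)  # 1 if the event had occurred by time Y

# Under current status sampling we only observe (monitor_time, delta).
# Isotonic (non-decreasing) regression of delta on monitor_time estimates
# the distribution function F(y) = P(T <= y); survival is S(y) = 1 - F(y).
fit <- isoreg(x = monitor_time, y = delta)
cdf_hat <- as.stepfun(fit)

# Evaluate the estimated survival function at a few time points.
eval_times <- c(1, 2, 3, 4)
surv_hat <- 1 - cdf_hat(eval_times)
print(round(surv_hat, 3))
```

The monotonicity constraint is what makes estimation possible here: each observation carries only a binary indicator, so the estimator pools neighboring monitoring times until the fitted values are non-decreasing.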

Installing survML

You can install a stable version of survML from CRAN using

install.packages("survML")

Alternatively, the development version of survML is available on GitHub. You can install it using the devtools package as follows:

## install.packages("devtools") # run only if necessary
devtools::install_github(repo = "cwolock/survML")

Integration with CFsurvival

The CFsurvival package can be used to estimate a covariate-adjusted counterfactual survival curve from observational data. This approach requires estimating the conditional event and censoring distributions. In this fork of the CFsurvival package, we have added stackG() from survML as an option for estimating these nuisance parameters.

Documentation

Full documentation can be found on the survML website at https://cwolock.github.io/survML/.

Bug reports and feature requests

To report a bug or request a new feature, please open a new GitHub Issue.

References

For details of the methods implemented in this package, please see the following papers:

Local survival stacking is described in:

Citation

After using the survML package for conditional survival estimation, please cite the following:

@article{wolock2024framework,
  title={A framework for leveraging machine learning tools to estimate personalized survival curves},
  author={Wolock, Charles J and Gilbert, Peter B and Simon, Noah and Carone, Marco},
  journal={Journal of Computational and Graphical Statistics},
  year={2024},
  volume={33},
  number={3},
  pages={1098--1108},
  publisher={Taylor \& Francis},
  doi={10.1080/10618600.2024.2304070}
}

After using the variable importance functions, please cite the following:

@article{wolock2023assessing,
  title={Assessing variable importance in survival analysis using machine learning},
  author={Wolock, Charles J and Gilbert, Peter B and Simon, Noah and Carone, Marco},
  journal={arXiv preprint arXiv:2311.12726},
  year={2023}
}

After using the functionality for current status data, please cite the following:

@article{wolock2024investigating,
  title={Investigating symptom duration using current status data: a case study of post-acute COVID-19 syndrome},
  author={Wolock, Charles J and Jacob, Susan and Bennett, Julia C and Elias-Warren, Anna and O'Hanlon, Jessica and Kenny, Avi and Jewell, Nicholas P and Rotnitzky, Andrea and Weil, Ana A and Chu, Helen Y and Carone, Marco},
  journal={arXiv preprint arXiv:2407.04214},
  year={2024}
}