12 Machine Learning Survival Models

Abstract

TODO (150-200 WORDS)

Major changes expected!

This page is a work in progress and major changes will be made over time.

12.1 A Survey of Machine Learning Models for Survival Analysis

The next sections provide a technical, critical survey of machine learning models proposed for survival analysis, focusing on the ‘simpler’ setup of non-competing risks. Models are separated into their different ‘classes’ (Table 12.1), which exist as a natural taxonomy in machine learning. Each class review first discusses the simpler and more standard regression setting before expanding into the survival framework. The focus is once again on the different predict types of the model, which enables clear exposition and discussion of how some areas have successfully dealt with the survival predictive problem, whereas others have fallen short.

This is not the first survey of machine learning models for survival analysis. A recent survey (P. Wang, Li, and Reddy 2019) focused on covering the breadth of machine learning models for survival analysis and is recommended to the reader as a strong starting point for understanding which ML models are available. However, whilst it provides a comprehensive review and a ‘big-picture’ view, it does not discuss how successful the reviewed models are in solving the survival task.

A comprehensive survey of neural networks was presented by Schwarzer \(\textit{et al.}\) (2000) (Schwarzer, Vach, and Schumacher 2000), in which the authors collected the many ways in which neural networks have been ‘misused’ in the context of survival analysis. This level of criticism is vital in the context of survival analysis and healthcare data, as transparency and understanding are often prioritised over predictive performance. Whilst the survey in this book will try not to be as critical as the Schwarzer review, it will aim to discuss models and how well they actually solve the survival problem.

Historically, surveys have focused primarily on predictive performance, which is generally preferred for complex classification and regression tasks. However, in the context of survival analysis, transparency is of the utmost importance, and any model that does not solve the task it claims to, despite strong predictive performance, can be considered sub-optimal. The survey will also examine the accessibility of survival models. A model need not be open-source to be accessible, but it should be ‘user-friendly’ and not require expert cross-domain knowledge. For example, a neural network may require knowledge of complex model building but, if set up correctly, could be handled without medical or survival knowledge. In contrast, a Gaussian process requires knowledge of the model class, simulation, (usually) Bayesian modelling, and also survival analysis.

Table 12.1 provides information about the models reviewed in this survey, including a model reference for use in the (R. E. B. Sonabend 2021) benchmark experiment, the predict types of the model, and the \(\textsf{R}\) package in which it is implemented.

Table 12.1: Summary of the models discussed in Section 12.1, by model class and respective survival task.

Class\(^1\)	Name\(^2\)	Authors (Year)\(^3\)	Task\(^4\)	Implementation\(^5\)
RF	RRT	LeBlanc and Crowley (1992) (LeBlanc and Crowley 1992)	Rank	\(\textbf{rpart}\) (Therneau and Atkinson 2019)
RF	RSDF-DEV	Hothorn \(\textit{et al.}\) (2004) (Hothorn et al. 2004)	Prob.	\(\textbf{ipred}\) (Peters and Hothorn 2019)
RF	RRF	Ishwaran \(\textit{et al.}\) (2004) (H. Ishwaran et al. 2004)	Rank	-
RF	RSCIFF	Hothorn \(\textit{et al.}\) (2006) (Hothorn et al. 2005)	Det., Prob.	\(\textbf{party}\) (Hothorn, Hornik, and Zeileis 2006), \(\textbf{partykit}\) (Hothorn and Zeileis 2015)
RF	RSDF-STAT	Ishwaran \(\textit{et al.}\) (2008) (H. Ishwaran et al. 2008)	Prob.	\(\textbf{randomForestSRC}\) (H. Ishwaran and Kogalur 2018), \(\textbf{ranger}\) (Wright and Ziegler 2017)
GBM	GBM-COX	Ridgeway (1999) (Ridgeway 1999) & Bühlmann & Hothorn (2007) (Bühlmann and Hothorn 2007)	Prob.	\(\textbf{mboost}\) (Hothorn et al. 2020), \(\textbf{xgboost}\) (Chen et al. 2020), \(\textbf{gbm}\) (Greenwell et al. 2019)
GBM	CoxBoost	Binder & Schumacher (2008) (Harald Binder and Schumacher 2008)	Prob.	\(\textbf{CoxBoost}\) (Harald Binder 2013)
GBM	GBM-AFT	Schmid & Hothorn (2008) (Schmid and Hothorn 2008)	Det.	\(\textbf{mboost}\), \(\textbf{xgboost}\)
GBM	GBM-BUJAR	Wang & Wang (2010) (Z. Wang and Wang 2010)	Det.	\(\textbf{bujar}\) (Z. Wang 2019)
GBM	GBM-GEH	Johnson & Long (2011) (Johnson and Long 2011)	Det.	\(\textbf{mboost}\)
GBM	GBM-UNO	Mayr & Schmid (2014) (Mayr and Schmid 2014)	Rank	\(\textbf{mboost}\)
SVM	SVCR	Shivaswamy \(\textit{et al.}\) (2007) (Shivaswamy, Chu, and Jansche 2007)	Det.	\(\textbf{survivalsvm}\) (Fouodo et al. 2018)
SVM	SSVM-Rank	Van Belle \(\textit{et al.}\) (2007) (Van Belle et al. 2007)	Rank	\(\textbf{survivalsvm}\)
SVM	SVRc	Khan and Zubek (2008) (Khan and Bayer Zubek 2008)	Det.	-
SVM	SSVM-Hybrid	Van Belle \(\textit{et al.}\) (2011) (Van Belle et al. 2011)	Det.	\(\textbf{survivalsvm}\)
SVM	SSVR-MRL	Goli \(\textit{et al.}\) (2016) (Goli, Mahjub, Faradmal, and Soltanian 2016; Goli, Mahjub, Faradmal, Mashayekhi, et al. 2016)	Det.	-
ANN	ANN-CDP	Liestøl \(\textit{et al.}\) (1994) (Liestøl, Andersen, and Andersen 1994)	Prob.	-
ANN	ANN-COX	Faraggi and Simon (1995) (Faraggi and Simon 1995)	Rank	-
ANN	PLANN	Biganzoli \(\textit{et al.}\) (1998) (Biganzoli et al. 1998)	Prob.	-
ANN	COX-NNET	Ching \(\textit{et al.}\) (2018) (Ching, Zhu, and Garmire 2018)	Prob.	\(^*\) (Ching 2015)
ANN	DeepSurv	Katzman \(\textit{et al.}\) (2018) (Katzman et al. 2018)	Prob.	\(\textbf{survivalmodels}\) (R. Sonabend 2020)
ANN	DeepHit	Lee \(\textit{et al.}\) (2018) (Lee et al. 2018)	Prob.	\(\textbf{survivalmodels}\)
ANN	Nnet-survival	Gensheimer & Narasimhan (2019) (Gensheimer and Narasimhan 2019)	Prob.	\(\textbf{survivalmodels}\)
ANN	Cox-Time	Kvamme \(\textit{et al.}\) (2019) (Kvamme, Borgan, and Scheel 2019)	Prob.	\(\textbf{survivalmodels}\)
ANN	PC-Hazard	Kvamme & Borgan (2019) (Kvamme and Borgan 2019)	Prob.	\(\textbf{survivalmodels}\)
ANN	RankDeepSurv	Jing \(\textit{et al.}\) (2019) (Jing et al. 2019)	Det.	\(\textbf{RankDeepSurv}\)\(^{\ast, \dagger}\) (Jing et al. 2018)
ANN	DNNSurv	Zhao & Feng (2020) (Zhao and Feng 2020)	Prob.	\(\textbf{survivalmodels}\)

\(^1\) Model class: RF – Random Forest; GBM – Gradient Boosting Machine; SVM – Support Vector Machine; ANN – Artificial Neural Network. There is some abuse of notation here, as some of the RFs are actually decision trees and some GBMs do not use gradient boosting.

\(^2\) Model identifier used in this section and in (R. E. B. Sonabend 2021).

\(^3\) Authors and year of publication; for RFs this is the paper most attributed to the algorithm.

\(^4\) Survival task type: Deterministic (Det.), Probabilistic (Prob.), Ranking (Rank).

\(^5\) If available in \(\textsf{R}\), the package in which the model is implemented; otherwise ‘\(\ast\)’ signifies that a model is only available in Python. With the exception of DNNSurv, all ANNs in \(\textbf{survivalmodels}\) are implemented from the Python package \(\textbf{pycox}\) (Kvamme 2018) with \(\textbf{reticulate}\) (Ushey, Allaire, and Tang 2020).

\(^\dagger\) Code is available to create the model, but it is not implemented ‘off-shelf’.
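To make the three task types in the ‘Task’ column concrete, the following is a minimal sketch and not an implementation of any model in Table 12.1. It assumes a plain Kaplan-Meier estimator as a stand-in for a probabilistic prediction, reduces it to a deterministic prediction through the median survival time, and derives a ranking from the negative median. The function names `kaplan_meier` and `median_survival`, the toy data, and the choice of the median as the reduction are all illustrative assumptions.

```python
import numpy as np


def kaplan_meier(times, events):
    """Kaplan-Meier estimate of S(t) at each distinct observed event time."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)            # still under observation just before t
        deaths = np.sum((times == t) & events)  # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)


def median_survival(event_times, surv):
    """First event time at which the estimated S(t) drops to 0.5 or below."""
    below = np.nonzero(surv <= 0.5)[0]
    return float(event_times[below[0]]) if below.size else float("inf")


# Toy right-censored data for two hypothetical groups (event = 1, censored = 0).
groups = {
    "A": ([2, 3, 5, 8, 10], [1, 1, 0, 1, 1]),
    "B": ([4, 6, 9, 12, 15], [1, 0, 1, 1, 1]),
}

predictions = {}
for name, (t, e) in groups.items():
    event_times, surv = kaplan_meier(t, e)    # Prob.: a survival distribution
    med = median_survival(event_times, surv)  # Det.: a single survival time
    # Rank: a relative risk, here (by assumption) the negative median time.
    predictions[name] = {"surv": surv, "time": med, "risk": -med}

# Higher risk first; only the ordering of the risk values is meaningful.
ranking = sorted(predictions, key=lambda g: predictions[g]["risk"], reverse=True)
print(ranking)
```

The sketch shows why the task types are not interchangeable: a distribution can be reduced to a time or a ranking, but a ranking alone carries no information about the time scale.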

Biganzoli, Elia, Patrizia Boracchi, Luigi Mariani, and Ettore Marubini. 1998. “Feed forward neural networks for the analysis of censored survival data: a partial logistic regression approach.” Statistics in Medicine 17 (10): 1169–86. https://doi.org/10.1002/(SICI)1097-0258(19980530)17:10<1169::AID-SIM796>3.0.CO;2-D.

Binder, Harald, and Martin Schumacher. 2008. “Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models.” BMC Bioinformatics 9 (1): 14. https://doi.org/10.1186/1471-2105-9-14.

Binder, Harald. 2013. “CoxBoost: Cox models by likelihood based boosting for a single survival endpoint or competing risks.” CRAN.

Bühlmann, Peter, and Torsten Hothorn. 2007. “Boosting Algorithms: Regularization, Prediction and Model Fitting.” Statist. Sci. 22 (4): 477–505. https://doi.org/10.1214/07-STS242.

Chen, Tianqi, Tong He, Michael Benesty, Vadim Khotilovich, Yuan Tang, Hyunsu Cho, Kailong Chen, et al. 2020. “xgboost: Extreme Gradient Boosting.” CRAN. https://cran.r-project.org/package=xgboost.

Ching, Travers. 2015. “Cox-Nnet.” https://github.com/lanagarmire/cox-nnet.

Ching, Travers, Xun Zhu, and Lana X Garmire. 2018. “Cox-nnet: An artificial neural network method for prognosis prediction of high-throughput omics data.” PLOS Computational Biology 14 (4): e1006076. https://doi.org/10.1371/journal.pcbi.1006076.

Faraggi, David, and Richard Simon. 1995. “A neural network model for survival data.” Statistics in Medicine 14 (1): 73–82. https://doi.org/10.1002/sim.4780140108.

Fouodo, Cesaire J K, I Konig, C Weihs, A Ziegler, and M Wright. 2018. “Support vector machines for survival analysis with R.” The R Journal 10 (July): 412–23.

Gensheimer, Michael F, and Balasubramanian Narasimhan. 2019. “A scalable discrete-time survival model for neural networks.” PeerJ 7: e6257.

Goli, Shahrbanoo, Hossein Mahjub, Javad Faradmal, Hoda Mashayekhi, and Ali-Reza Soltanian. 2016. “Survival Prediction and Feature Selection in Patients with Breast Cancer Using Support Vector Regression.” Edited by Francesco Pappalardo. Computational and Mathematical Methods in Medicine 2016: 2157984. https://doi.org/10.1155/2016/2157984.

Goli, Shahrbanoo, Hossein Mahjub, Javad Faradmal, and Ali-Reza Soltanian. 2016. “Performance Evaluation of Support Vector Regression Models for Survival Analysis: A Simulation Study.” International Journal of Advanced Computer Science and Applications 7 (June). https://doi.org/10.14569/IJACSA.2016.070650.

Greenwell, Brandon, Bradley Boehmke, Jay Cunningham, and GBM Developers. 2019. “gbm: Generalized Boosted Regression Models.” CRAN. https://cran.r-project.org/package=gbm.

Hothorn, Torsten, Peter Buehlmann, Thomas Kneib, Matthias Schmid, and Benjamin Hofner. 2020. “mboost: Model-Based Boosting.” CRAN. https://cran.r-project.org/package=mboost.

Hothorn, Torsten, Peter Bühlmann, Sandrine Dudoit, Annette Molinaro, and Mark J Van Der Laan. 2005. “Survival ensembles.” Biostatistics 7 (3): 355–73. https://doi.org/10.1093/biostatistics/kxj011.

Hothorn, Torsten, Kurt Hornik, and Achim Zeileis. 2006. “Unbiased Recursive Partitioning: A Conditional Inference Framework.” Journal of Computational and Graphical Statistics 15 (3): 651–674.

Hothorn, Torsten, Berthold Lausen, Axel Benner, and Martin Radespiel-Tröger. 2004. “Bagging survival trees.” Statistics in Medicine 23 (1): 77–91. https://doi.org/10.1002/sim.1593.

Hothorn, Torsten, and Achim Zeileis. 2015. “partykit: A Modular Toolkit for Recursive Partytioning in R.” Journal of Machine Learning Research 16: 3905–9. http://jmlr.org/papers/v16/hothorn15a.html.

Ishwaran, Hemant, Udaya B Kogalur, Eugene H Blackstone, and Michael S Lauer. 2008. “Random survival forests.” The Annals of Applied Statistics 2 (3): 841–60. https://doi.org/10.1214/08-AOAS169.

Ishwaran, Hemant, Eugene H Blackstone, Claire E Pothier, and Michael S Lauer. 2004. “Relative Risk Forests for Exercise Heart Rate Recovery as a Predictor of Mortality.” Journal of the American Statistical Association 99 (467): 591–600. https://doi.org/10.1198/016214504000000638.

Ishwaran, Hemant, and Udaya B Kogalur. 2018. “randomForestSRC.” https://cran.r-project.org/package=randomForestSRC.

Jing, Bingzhong, Tao Zhang, Zixian Wang, Ying Jin, Kuiyuan Liu, Wenze Qiu, Liangru Ke, et al. 2018. “RankDeepSurv.” https://github.com/sysucc-ailab/RankDeepSurv.

Jing, Bingzhong, Tao Zhang, Zixian Wang, Ying Jin, Kuiyuan Liu, Wenze Qiu, Liangru Ke, et al. 2019. “A deep survival analysis method based on ranking.” Artificial Intelligence in Medicine 98: 1–9. https://doi.org/10.1016/j.artmed.2019.06.001.

Johnson, Brent A, and Qi Long. 2011. “Survival ensembles by the sum of pairwise differences with application to lung cancer microarray studies.” Ann. Appl. Stat. 5 (2A): 1081–101. https://doi.org/10.1214/10-AOAS426.

Katzman, Jared L, Uri Shaham, Alexander Cloninger, Jonathan Bates, Tingting Jiang, and Yuval Kluger. 2018. “DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network.” BMC Medical Research Methodology 18 (1): 24. https://doi.org/10.1186/s12874-018-0482-1.

Khan, Faisal M., and Valentina Bayer Zubek. 2008. “Support vector regression for censored data (SVRc): A novel tool for survival analysis.” Proceedings - IEEE International Conference on Data Mining, ICDM, 863–68. https://doi.org/10.1109/ICDM.2008.50.

Kvamme, Håvard. 2018. “Pycox.” https://pypi.org/project/pycox/.

Kvamme, Håvard, Ørnulf Borgan, and Ida Scheel. 2019. “Time-to-event prediction with neural networks and Cox regression.” Journal of Machine Learning Research 20 (129): 1–30.

LeBlanc, Michael, and John Crowley. 1992. “Relative Risk Trees for Censored Survival Data.” Biometrics 48 (2): 411–25. https://doi.org/10.2307/2532300.

Lee, Changhee, William Zame, Jinsung Yoon, and Mihaela Van der Schaar. 2018. “DeepHit: A Deep Learning Approach to Survival Analysis With Competing Risks.” Proceedings of the AAAI Conference on Artificial Intelligence 32 (1). https://doi.org/10.1609/aaai.v32i1.11842.

Liestøl, Knut, Per Kragh Andersen, and Ulrich Andersen. 1994. “Survival analysis and neural nets.” Statistics in Medicine 13 (12): 1189–1200. https://doi.org/10.1002/sim.4780131202.

Mayr, Andreas, and Matthias Schmid. 2014. “Boosting the concordance index for survival data: a unified framework to derive and evaluate biomarker combinations.” PLoS ONE 9 (1): e84483. https://doi.org/10.1371/journal.pone.0084483.

Peters, Andrea, and Torsten Hothorn. 2019. “ipred: Improved Predictors.” CRAN. https://cran.r-project.org/package=ipred.

Ridgeway, Greg. 1999. “The state of boosting.” Computing Science and Statistics 31: 172–181.

Schmid, Matthias, and Torsten Hothorn. 2008. “Flexible boosting of accelerated failure time models.” BMC Bioinformatics 9 (February): 269. https://doi.org/10.1186/1471-2105-9-269.

Schwarzer, Guido, Werner Vach, and Martin Schumacher. 2000. “On the misuses of artificial neural networks for prognostic and diagnostic classification in oncology.” Statistics in Medicine 19 (4): 541–61. https://doi.org/10.1002/(SICI)1097-0258(20000229)19:4<541::AID-SIM355>3.0.CO;2-V.

Shivaswamy, Pannagadatta K., Wei Chu, and Martin Jansche. 2007. “A support vector approach to censored targets.” In Proceedings - IEEE International Conference on Data Mining, ICDM, 655–60. https://doi.org/10.1109/ICDM.2007.93.

Sonabend, Raphael. 2020. “survivalmodels: Models for Survival Analysis.” CRAN. https://raphaels1.r-universe.dev/ui#package:survivalmodels.

Sonabend, Raphael Edward Benjamin. 2021. “A Theoretical and Methodological Framework for Machine Learning in Survival Analysis: Enabling Transparent and Accessible Predictive Modelling on Right-Censored Time-to-Event Data.” PhD, University College London (UCL). https://discovery.ucl.ac.uk/id/eprint/10129352/.

Therneau, Terry M., and Beth Atkinson. 2019. “rpart: Recursive Partitioning and Regression Trees.” CRAN.

Ushey, Kevin, J J Allaire, and Yuan Tang. 2020. “reticulate: Interface to ’Python’.” CRAN. https://cran.r-project.org/package=reticulate.

Van Belle, Vanya, Kristiaan Pelckmans, Johan A. K. Suykens, and Sabine Van Huffel. 2007. “Support Vector Machines for Survival Analysis.” In Proceedings of the Third International Conference on Computational Intelligence in Medicine and Healthcare. 1.

Van Belle, Vanya, Kristiaan Pelckmans, Sabine Van Huffel, and Johan A. K. Suykens. 2011. “Support vector methods for survival analysis: A comparison between ranking and regression approaches.” Artificial Intelligence in Medicine 53 (2): 107–18. https://doi.org/10.1016/j.artmed.2011.06.006.

Wang, Ping, Yan Li, and Chandan K. Reddy. 2019. “Machine Learning for Survival Analysis.” ACM Computing Surveys 51 (6): 1–36. https://doi.org/10.1145/3214306.

Wang, Zhu. 2019. “bujar: Buckley-James Regression for Survival Data with High-Dimensional Covariates.” CRAN. https://cran.r-project.org/package=bujar.

Wang, Zhu, and C Y Wang. 2010. “Buckley-James Boosting for Survival Analysis with High-Dimensional Biomarker Data.” Statistical Applications in Genetics and Molecular Biology 9 (1). https://doi.org/10.2202/1544-6115.1550.

Wright, Marvin N., and Andreas Ziegler. 2017. “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software 77 (1): 1–17.

Zhao, Lili, and Dai Feng. 2020. “Deep Neural Networks for Survival Analysis Using Pseudo Values.” IEEE Journal of Biomedical and Health Informatics 24 (11): 3308–14. https://doi.org/10.1109/JBHI.2020.2980204.