1 Introduction
This page is a work in progress and major changes will be made over time.
TODO
- Mention somewhere that SA can be used to solve T-year prediciton problems (i.e., see if we can get classif users over to SA).
- Also SA can be used for censoring/truncation
Writing after a global pandemic, applications of survival analysis are more relevant than ever. Predicting the time from onset of COVID-19 symptoms to hospitalisation, or the time from hospitalisation to intubation, or intubation to death, are all time-to-event predictions that are at the centre of survival analysis. As well as morbid applications, survival analysis predictions may be concerned with predicting the time until a customer cancels their gym membership, or the lifetime of a lightbulb; any event that is guaranteed (or at least very likely) to occur can be modelled by a survival analysis prediction. As these predictions can be so sensitive, for example a model predicting when a child should be taken off breathing support (Data Study Group Team 2020), the best possible predictions, evaluated to the highest standard, are a necessity. In other fields of predictive modelling, machine learning has made incredible breakthroughs (such as AlphaFold), therefore applying machine learning to survival analysis is a natural step in the evolution of an important field.
Survival analysis is the field of Statistics focusing on modelling the distribution of an event, which may mean the time until the event takes place, the risk of the event happening, the probability of the event occurring at a single time, or the event’s underlying probability distribution. Survival analysis (‘survival’) is a unique field of study in Statistics as it includes the added difficulty of ‘censoring’. Censoring is best described through example: a study is conducted to determine the mortality rate of a group of patients after diagnoses with a particular disease. If a patient dies during this study then their outcome is ‘death’ and their time of death can be recorded. However if a patient drops-out of the study before they die, then their time of death (though guaranteed to occur) is unknown and the only available information is the time at which they left the study. This patient is now said to be censored at the time they drop out. The censoring mechanism allows as much outcome information (time and event) to be captured as possible for all patients (observations).
Machine learning (ML) is the field of Statistics primarily concerned with building models to either predict outputs from inputs or to learn relationships from data (Hastie, Tibshirani, and Friedman 2001; James et al. 2013). This book is limited to the former case, or more specifically supervised learning, as this is the field in which the vast majority of survival problems live. Relative to other areas of supervised learning, development in survival analysis has been slow – the majority of developments in machine learning for survival analysis have only been in the past decade (see chapters (?sec-review)-(Chapter 7)). This appears to have resulted in less interest in the development of machine learning survival models (?sec-review), less rigour in the evaluation of such models (Chapter 7), and fewer off-shelf/open-source implementations (Sonabend et al. 2021). This book seeks to set the foundations for clear workflows, good practice, and precise results for ‘machine learning survival analysis’.
1.1 Why is this book needed?
Firstly, whilst there are many books dedicated to regression and classification as machine learning problems (the ‘bibles’ of machine learning focus entirely on regression and classification only (Bishop 2006; Hastie, Tibshirani, and Friedman 2001; James et al. 2013)), there is a deficit of books covering the survival analysis setting. By writing this book we hope to fill this gap and enable more practitioners to use cutting-edge methods in survival analysis. Survival analysis has important applications in healthcare, finance, engineering and more, all fields that directly impact upon individual lives on a day-to-day basis, and should perhaps be considered as important as classification and regression. The result of this gap in interest, is the erroneous assumption that one field can be directly applied to another. For example there is evidence of researchers treating censoring as a nuisance to be ignored and using regression models instead (Schwarzer, Vach, and Schumacher 2010). Censoring is indeed a challenge and may contribute to making survival analysis less accessible than other fields, but this need not be the case; a clear unification of terminology and presentation of methods may help make ‘machine learning survival analysis’ more accessible. Added accessibility could lead to more academics (and non-academics) engaging with the field and promoting good standards of practice, as well as developing more novel models and measures.
Where survival models have been developed, these have skewed towards ‘ranking models’, which predict the relative risk of an event occurring (Section 6.1). In many applications these predictions are sufficient, for example in randomised control trials if assessing the increased/decreased risk of an event after treatment. However, there are many use-cases where predicting an individual’s survival probability distribution is required. Take, for example, an engineer calculating the lifetime of a plane’s engine.1 There are three important reasons to replace a jet engine at the optimal time:
1 In this engineering context, survival analysis is usually referred to as reliability analysis.
- financial: jet engines are very expensive and replacing one sooner than required is a waste of money;
- environmental: an engine being replaced too early is a waste of potential usage;
- safety: if the engine is replaced too late then there is a risk to passengers.
Now consider examples for the three possible ‘prediction types’ the engineer can make:
- A ‘relative risk prediction’: This engine is twice as likely to fail as another.
- A ‘survival time prediction’: The engine is expected to fail in 30 days.
- A ‘survival distribution prediction’: The lifetime of the engine is distributed according to the probability distribution \(\zeta\).
The first prediction type is not useful as the underlying relative risk may be unknown and the engineer is concerned with the individual lifetime. The second prediction type provides a useful quantity for the engineer to work with however there is no uncertainty captured in this prediction. The third prediction type can capture the uncertainty of failure over the entirety of the positive Reals (though usually only a small subset is possible and useful). With this final prediction type, the engineer can create safe decisions: ‘replace the engine at time \(\tau\), where \(\tau\) is the time when the predicted probability of survival drops below 60%, \(S(\tau) = 0.6\)’. There are ethical, economic, and environmental reasons for a good survival distribution prediction and this book considers a distribution prediction to be the most important prediction type.
Evaluating predictions from survival models is of the utmost importance. This is especially important as survival models are often deployed in the public domain, particularly in healthcare. Physical products in healthcare, such as new vaccines, undergo rigorous testing and research in randomised control trials before being publically deployed; the same level of rigour should be expected for the evaluation of survival models that are used in life-and-death situations. Evaluation measures for regression and classification are well-understood with important properties, however survival measures have not undergone the same treatment. For example many survival models are still being evaluated solely with concordance indices that have been repeatedly criticised (Gönen and Heller 2005; Rahman et al. 2017; Schmid and Potapov 2012).
1.2 Reproducibility
This book includes simulations and figures generated in \(\textsf{R}\), the code for any figures or experiments in this book are freely available at https://github.com/mlsa-book/MLSA under an MIT licence and all content on this website is available under CC BY 4.0.