Symbols and Notation

Minor changes expected!

This page is a work in progress and minor changes will be made over time.

The most common symbols and notation used throughout this book are presented below; in rare cases where different meanings are intended within the book, this will be made clear.

A lower-case letter in normal font, \(x\), refers to a single, fixed observation. A lower-case letter in bold font, \(\mathbf{x}\), refers to a vector of fixed observations, and a bold upper-case letter, \(\mathbf{X}\), represents a matrix. A letter in normal font with a single subscript refers to a single element of a vector, for example \(x_i\) is the \(i\)th element of vector \(\mathbf{x}\), whereas two subscripts refer to a single element of a matrix, for example \(x_{ij}\) is the element in the \(i\)th row and \(j\)th column of matrix \(\mathbf{X}\). A row of a matrix, \(\mathbf{X}\), is denoted by a bold-face lower-case letter with a subscript, for example the \(i\)th row is \(\mathbf{x}_i\); in contrast, the \(j\)th column is written as \(\mathbf{x}_{;j}\). Calligraphic letters, \(\mathcal{X}\), are used to denote sets.
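
As a small illustration of the indexing conventions above (the numbers are arbitrary and chosen only for this example), consider a matrix with two rows and three columns:

\[ \mathbf{X}= \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \]

Here \(x_{12} = 2\) and \(x_{21} = 4\); the first row, \(\mathbf{x}_1\), has elements \(1, 2, 3\); and the second column, \(\mathbf{x}_{;2}\), has elements \(2, 5\).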

A matrix will always be defined with its dimensions using the notation \(\mathbf{X}\in \mathcal{X}^{n \times p}\), or if \(\mathcal{X}\) is the set of Reals, it may be written as “\(\mathbf{X}\) is an \(n \times p\) Real-valued matrix”. By default, a ‘vector’ will refer to a column vector, which may be thought of as a matrix with \(n\) rows and one column, and may be represented as:

\[ \mathbf{x}= \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \]

Vectors are usually defined using transpose notation, for example the vector above may instead be written as \(\mathbf{x}^\top= (x_1 \ x_2 \cdots x_n)\) or \(\mathbf{x}= (x_1 \ x_2 \cdots x_n)^\top\). Vectors may also be defined in the shortened format \(\mathbf{x}\in \mathcal{X}^n\), which implies a vector of length \(n\) with elements as represented above.

Typically, a ‘hat’, \(\hat{x}\), will refer to the prediction or estimation of a variable, \(x\), with bold-face used again to represent vectors. A ‘bar’, \(\bar{x}\), refers to the sample mean of \(\mathbf{x}\). Capital letters in normal font, \(X\), refer to scalar or vector random variables; which of the two is meant will be made clear from context. \(\mathbb{E}(X)\) and \(\operatorname{Var}(X)\) are the expectation and variance of the random variable \(X\) respectively. We write \(A \perp \!\!\! \perp B\) to denote that \(A\) and \(B\) are independent, i.e., that \(P(A \cap B) = P(A)P(B)\).
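
The independence condition and the sample mean can be checked numerically on a toy case. The following is a minimal sketch, assuming a single roll of a fair six-sided die as the sample space; the names `omega`, `prob`, `A`, `B`, `x`, and `x_bar` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Probability of an event (a subset of omega) under equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))


A = {2, 4, 6}  # the roll is even
B = {1, 2}     # the roll is at most two

# A and B are independent exactly when P(A ∩ B) = P(A)P(B).
independent = prob(A & B) == prob(A) * prob(B)

# Sample mean, x-bar, of a vector of fixed observations x (illustrative values).
x = [2, 4, 6, 8]
x_bar = Fraction(sum(x), len(x))
```

Exact fractions are used instead of floating-point numbers so that the equality in the independence check is not affected by rounding.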

A function \(f\) will either be written as a formal map of domain to codomain, \(f: \mathcal{X}\rightarrow \mathcal{Y}; (x, y) \mapsto f(x, y)\) (which is most useful for understanding inputs and outputs), or more simply and commonly as \(f(x, y)\). Given a random variable, \(X\), following distribution \(\zeta\) (mathematically written \(X \sim \zeta\)), then \(f_X\) denotes the probability density function, and analogously for other distribution-defining functions. In the survival analysis context (4 Survival Analysis), a subscript “\(0\)” refers to a “baseline” function, for example, \(S_0\) is the baseline survival function.
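
As a brief illustration of both conventions (an example chosen here, not one taken from elsewhere in the book), the squaring function on the Reals can be written as a formal map,

\[ f: \mathbb{R}\rightarrow \mathbb{R}_{\geq 0}; x \mapsto x^2 \]

or more simply as \(f(x) = x^2\). Similarly, assuming \(X\) follows an Exponential distribution with rate \(\lambda > 0\), written here as \(X \sim \operatorname{Exp}(\lambda)\), and assuming \(F_X\) denotes the cumulative distribution function by analogy with \(f_X\), then for \(x \geq 0\):

\[ f_X(x) = \lambda e^{-\lambda x}, \quad F_X(x) = 1 - e^{-\lambda x} \]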

Common variables and acronyms used in the book are given in Table 1 and Table 2 respectively.

Table 1: Common variables used throughout the book.

| Variable | Definition |
|---|---|
| \(\mathbb{R}, \mathbb{R}_{>0}, \mathbb{R}_{\geq 0}, \bar{\mathbb{R}}\) | Set of Reals, positive Reals (excl. zero), non-negative Reals (incl. zero), and Reals including \(\pm\infty\). |
| \(\mathbb{N}_{> 0}\) | Set of Naturals excluding zero. |
| \((\mathbf{X}, \mathbf{t}, \boldsymbol{\delta})\) | Survival data where \(\mathbf{X}\) is an \(n \times p\) matrix of features, \(\mathbf{t}\) is a vector of observed outcome times, and \(\boldsymbol{\delta}\) is a vector of observed outcome indicators. |
| \(\boldsymbol{\beta}\) | Vector of model coefficients/weights. |
| \(\boldsymbol{\eta}\) | Linear predictor, \(\mathbf{X}\boldsymbol{\beta}\). |
| \(\mathcal{D}, \mathcal{D}_{train}, \mathcal{D}_{test}\) | Dataset, training data, and testing data. |

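
The survival data and linear predictor listed in Table 1 can be made concrete with a small numerical sketch. The values below are invented purely for illustration, the helper `linear_predictor` is a hypothetical name, and the code assumes the usual convention that an outcome indicator of 1 marks an observed event and 0 a censored observation.

```python
# Hypothetical toy survival data (X, t, delta) with n = 3 observations and p = 2 features.
X = [
    [1.0, 0.0],
    [0.5, 2.0],
    [2.0, 1.0],
]
t = [5.0, 8.0, 3.0]  # observed outcome times
delta = [1, 0, 1]    # outcome indicators (assumed: 1 = event, 0 = censored)

# Illustrative coefficient vector, beta, with one weight per feature.
beta = [0.5, -1.0]


def linear_predictor(X, beta):
    """Return eta = X beta as a list with one entry per row of X."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]


eta = linear_predictor(X, beta)
```

Each entry \(\eta_i\) is the inner product of the \(i\)th row of \(\mathbf{X}\) with \(\boldsymbol{\beta}\), so \(\boldsymbol{\eta}\) has length \(n\), matching \(\mathbf{t}\) and \(\boldsymbol{\delta}\).
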
Table 2: Common acronyms used throughout the book.

| Acronym | Definition |
|---|---|
| AFT | Accelerated Failure Time |
| cdf | Cumulative Distribution Function |
| chf | Cumulative Hazard Function |
| CPH | Cox Proportional Hazards |
| GBM | Gradient Boosting Machine |
| GLM | Generalised Linear Model |
| IPC(W) | Inverse Probability of Censoring (Weighted) |
| ML | Machine Learning |
| pdf | Probability Density Function |
| PH | Proportional Hazards |
| (S)SVM | (Survival) Support Vector Machine |
| t.v.i. | Taking Values In |