Symbols and Notation

Minor changes expected!

This page is a work in progress and minor changes will be made over time.

The most common symbols and notation used throughout this book are presented below; in rare cases where different meanings are intended within the book, this will be made clear.

Fonts, matrices, vectors

A lower-case letter in normal font, \(x\), refers to a single, fixed observation. When in bold font, a lower-case letter, \(\mathbf{x}\), refers to a vector of fixed observations, and an upper-case letter, \(\mathbf{X}\), represents a matrix. Calligraphic letters, \(\mathcal{X}\), are used to denote sets.

A matrix will always be defined with its dimensions using the notation \(\mathbf{X}\in \mathcal{X}^{n \times p}\), or, if for example \(\mathcal{X}\) is the set of Reals, it may be written as “\(\mathbf{X}\) is an \(n \times p\) Real-valued matrix”, and analogously for integer-valued matrices etc. By default, a ‘vector’ will refer to a column vector, which may be thought of as a matrix with \(n\) rows and one column, and may be represented as:

\[ \mathbf{x}= \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \]

Vectors are usually defined using transpose notation, for example the vector above may instead be written as \(\mathbf{x}^\intercal= (x_1 \ x_2 \cdots x_n)\) or \(\mathbf{x}= (x_1 \ x_2 \cdots x_n)^\intercal\). Vectors may also be defined in a shortened format as \(\mathbf{x}\in \mathcal{X}^n\), which implies a vector of length \(n\) with elements as represented above.

A letter in normal font with one subscript refers to a single element from a vector. For example, given \(\mathbf{x}\in \mathcal{X}^n\), the \(i\)th element is denoted \(x_i\). Given a matrix \(\mathbf{X}\in \mathcal{X}^{n \times p}\), a bold-face lower-case letter with a single subscript refers to a row of the matrix, for example the \(i\)th row would be \(\mathbf{x}_i = (x_{i;1} \ x_{i;2} \cdots x_{i;p})^\intercal\). A column, in contrast, is referenced with a semi-colon before the subscript, for example the \(j\)th column would be \(\mathbf{x}_{;j} = (x_{1;j} \ x_{2;j} \cdots x_{n;j})^\intercal\). Two subscripts can be used to reference a single element of a matrix, for example \(x_{i;j}\) would be the element in the \(i\)th row and \(j\)th column of \(\mathbf{X}\).
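The indexing conventions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrix values and the helper names `row`, `column`, and `element` are hypothetical and do not come from the book. Note that the notation in the text is 1-indexed, whereas Python lists are 0-indexed, so each helper subtracts one.

```python
# A 3 x 2 matrix X stored as a list of rows
# (n = 3 observations, p = 2 features; values are made up for illustration).
X = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]


def row(X, i):
    """Return x_i, the i-th row of X (1-indexed, as in the text)."""
    return list(X[i - 1])


def column(X, j):
    """Return x_{;j}, the j-th column of X (1-indexed)."""
    return [r[j - 1] for r in X]


def element(X, i, j):
    """Return x_{i;j}, the element in row i and column j (1-indexed)."""
    return X[i - 1][j - 1]


n, p = len(X), len(X[0])
print(n, p)              # dimensions n and p
print(row(X, 2))         # x_2
print(column(X, 1))      # x_{;1}
print(element(X, 3, 2))  # x_{3;2}
```

A row has \(p\) elements and a column has \(n\) elements, which is a quick way to check which of the two a given subscript refers to.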

Functions

Typically, a ‘hat’, \(\hat{x}\), will refer to the prediction or estimation of a variable, \(x\), with bold-face used again to represent vectors. A ‘bar’, \(\bar{x}\), refers to the sample mean of \(\mathbf{x}\). Capital letters in normal font, \(X\), refer to scalar or vector random variables; which of the two is meant will be made clear from context. \(\mathbb{E}(X)\) and \(\operatorname{Var}(X)\) are the expectation and variance of the random variable \(X\) respectively. We write \(A \perp \!\!\! \perp B\) to denote that \(A\) and \(B\) are independent, i.e., that \(P(A \cap B) = P(A)P(B)\).
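As a small numerical illustration of the sample mean and of the independence condition \(P(A \cap B) = P(A)P(B)\), the sketch below uses two fair coin tosses. The data, the events, and the helper names `prob` and `independent` are assumptions made purely for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample mean x-bar of a vector of fixed observations x (made-up values).
x = [2, 4, 6, 8]
x_bar = Fraction(sum(x), len(x))

# Sample space: two fair coin tosses, every outcome equally likely.
omega = list(product("HT", repeat=2))


def prob(event):
    """P(event) under the uniform distribution on omega."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))


def independent(E, F):
    """Check the defining condition P(E and F) == P(E) * P(F)."""
    return prob(lambda w: E(w) and F(w)) == prob(E) * prob(F)


def A(w):
    return w[0] == "H"  # first toss is heads


def B(w):
    return w[1] == "H"  # second toss is heads


def C(w):
    return w == ("H", "H")  # both tosses are heads


print(x_bar)              # sample mean of x
print(independent(A, B))  # separate tosses
print(independent(A, C))  # C cannot occur without A
```

Exact fractions are used rather than floats so that the equality in the independence check is not affected by rounding.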

A function \(f\) will either be written as a formal map of domain to codomain, \(f: \mathcal{X}\rightarrow \mathcal{Y}; (x, y) \mapsto f(x, y)\) (which is most useful for understanding inputs and outputs), or more simply and commonly as \(f(x, y)\). Given a random variable, \(X\), following distribution \(\zeta\) (mathematically written \(X \sim \zeta\)), \(f_X\) denotes its probability density function, and analogously for other distribution-defining functions. In the survival analysis context (see Chapter 4, Survival Analysis), a subscript “\(0\)” refers to a “baseline” function, for example, \(S_0\) is the baseline survival function.
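To make the subscript convention concrete, the following sketch writes out several distribution-defining functions for one example distribution. Everything specific here is an assumption for illustration: the choice of an exponential distribution, the rate value, and the names `F_X`, `S_X`, and `H_X` for the cdf, survival function, and chf (the text itself only fixes \(f_X\) for the pdf).

```python
import math

RATE = 0.5  # hypothetical rate parameter of an exponential distribution


def f_X(x):
    """Probability density function (pdf) of X ~ Exponential(RATE)."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def F_X(x):
    """Cumulative distribution function (cdf), P(X <= x)."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


def S_X(x):
    """Survival function, P(X > x) = 1 - F_X(x)."""
    return 1.0 - F_X(x)


def H_X(x):
    """Cumulative hazard function (chf), -log S_X(x)."""
    return -math.log(S_X(x))


for x in (0.0, 1.0, 2.0):
    print(x, f_X(x), F_X(x), S_X(x), H_X(x))
```

For this particular distribution the cumulative hazard is linear in \(x\), which gives a simple sanity check on the four functions.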

Variables and acronyms

Common variables and acronyms used in the book are given in Table 1 and Table 2 respectively.

Table 1: Common variables used throughout the book.

| Variable | Definition |
| --- | --- |
| \(\mathbb{R}, \mathbb{R}_{>0}, \mathbb{R}_{\geq 0}, \bar{\mathbb{R}}\) | Set of Reals, positive Reals (excl. zero), non-negative Reals (incl. zero), and Reals including \(\pm\infty\). |
| \(\mathbb{N}_{> 0}\) | Set of Naturals excluding zero. |
| \((\mathbf{X}, \mathbf{t}, \boldsymbol{\delta})\) | Survival data where \(\mathbf{X}\in \mathbb{R}^{n \times p}\) is a real-valued matrix of \(n\) observations (rows) and \(p\) features (columns), \(\mathbf{t}\) is a vector of observed outcome times, and \(\boldsymbol{\delta}\) is a vector of observed outcome indicators. |
| \(\boldsymbol{\beta}\) | Vector of model coefficients/weights. |
| \(\boldsymbol{\eta}\) | Vector of linear predictors, \(\boldsymbol{\eta} = ({\eta}_1 \ {\eta}_2 \cdots {\eta}_{n})^\intercal\), where \(\boldsymbol{\eta}= \mathbf{X}\boldsymbol{\beta}\) and \(\eta_i = \mathbf{x}_{i}^\intercal\boldsymbol{\beta}\). |
| \(\mathcal{D}, \mathcal{D}_{train}, \mathcal{D}_{test}\) | Dataset, training data, and testing data. |

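The linear predictor defined in Table 1, \(\boldsymbol{\eta}= \mathbf{X}\boldsymbol{\beta}\) with \(\eta_i = \mathbf{x}_{i}^\intercal\boldsymbol{\beta}\), can be sketched in plain Python as follows. The feature values, the coefficients, and the function name `linear_predictor` are hypothetical and chosen only to illustrate the notation.

```python
# Feature matrix X with n = 3 observations (rows) and p = 2 features (columns).
# Values are made up for illustration.
X = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]

# Hypothetical coefficient vector beta of length p.
beta = [0.5, -1.0]


def linear_predictor(X, beta):
    """Return eta = X beta, computed row by row as eta_i = x_i^T beta."""
    if any(len(x_i) != len(beta) for x_i in X):
        raise ValueError("each row of X must have the same length as beta")
    return [sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta)) for x_i in X]


eta = linear_predictor(X, beta)
print(eta)  # one linear predictor per observation
```

The result has one entry per row of \(\mathbf{X}\), so \(\boldsymbol{\eta}\) is a vector of length \(n\) regardless of the number of features \(p\).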
Table 2: Common acronyms used throughout the book.

| Acronym | Definition |
| --- | --- |
| AFT | Accelerated Failure Time |
| cdf | Cumulative Distribution Function |
| chf | Cumulative Hazard Function |
| CPH | Cox Proportional Hazards |
| GBM | Gradient Boosting Machine |
| GLM | Generalised Linear Model |
| IPC(W) | Inverse Probability of Censoring (Weighted) |
| ML | Machine Learning |
| pdf | Probability Density Function |
| PH | Proportional Hazards |
| (S)SVM | (Survival) Support Vector Machine |
| t.v.i. | Taking Values In |