Probability & Statistics: Formulas and Theorems

Essential Formulas and Theorems

A deep understanding of core statistical principles is crucial for modeling financial markets, pricing derivatives, and managing risk.

I. Core Probability Laws

These laws govern how probabilities are calculated and updated, forming the basis for statistical inference and decision-making under uncertainty.

Conditional Probability, Bayes' Theorem, and Law of Total Probability

Consider events A_1, \dots, A_n which form a partition of the sample space (i.e., they are mutually exclusive and collectively exhaustive) and an event B.

Conditional Probability:
\mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}
The probability of event A occurring given that event B has already occurred.

Law of Total Probability:
\mathbb{P}(B) = \sum_{i=1}^n \mathbb{P}(B \cap A_i) = \sum_{i=1}^n \mathbb{P}(B \mid A_i)\mathbb{P}(A_i)
Used to find the marginal probability of an event B when the sample space is partitioned.

Bayes' Theorem:
\mathbb{P}(A_1 \mid B) = \frac{\mathbb{P}(B \mid A_1)\mathbb{P}(A_1)}{\mathbb{P}(B)}
Relates the posterior probability \mathbb{P}(A_1 \mid B) to the prior \mathbb{P}(A_1) and the likelihood \mathbb{P}(B \mid A_1). Relevance: crucial for updating beliefs as new data arrives.
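As a quick sanity check, the three laws above compose into the classic false-positive calculation. The prevalence and test accuracies below are made-up illustrative numbers, not data from the text:

```python
# Hypothetical numbers: a test with 99% sensitivity and a 5% false-positive
# rate, for a condition with 1% prevalence.
p_condition = 0.01
p_pos_given_condition = 0.99       # likelihood P(B | A1)
p_pos_given_no_condition = 0.05    # P(B | A2), the false-positive rate

# Law of Total Probability: P(B) = sum_i P(B | A_i) * P(A_i)
p_pos = (p_pos_given_condition * p_condition
         + p_pos_given_no_condition * (1 - p_condition))

# Bayes' Theorem: posterior P(A1 | B)
posterior = p_pos_given_condition * p_condition / p_pos
print(round(posterior, 4))  # 0.1667
```

Despite the accurate test, the posterior is only about 1/6, because the prior is so small: a standard interview talking point.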

II. Moments and Relationships

Moments describe the shape and location of a probability distribution. Understanding their properties is key to manipulating random variables in models.

Law of the Unconscious Statistician (LOTUS)

The expected value of a function g(X) of a random variable can be calculated without first finding the distribution of Y = g(X).

\mathbb{E}[g(X)] = \int_{\mathbb{R}} g(x)\, f_X(x)\, dx \quad (\text{continuous } X), \qquad \mathbb{E}[g(X)] = \sum_{k \in \mathrm{Supp}(X)} g(k)\, \mathbb{P}(X = k) \quad (\text{discrete } X)
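A minimal discrete illustration of LOTUS, using an assumed example of a fair six-sided die and g(x) = x^2:

```python
from fractions import Fraction

# Discrete LOTUS: E[g(X)] = sum over the support of g(k) * P(X = k).
# Assumed example: X is a fair six-sided die, g(x) = x**2.
support = range(1, 7)
p = Fraction(1, 6)

e_g_lotus = sum(Fraction(k**2) * p for k in support)

# Same answer via the distribution of Y = g(X) directly
# (g is injective on this support, so Y inherits the uniform weights).
e_y = sum(Fraction(y) * p for y in (k**2 for k in support))

print(e_g_lotus)  # 91/6
```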

Law of Total Expectation and Variance

These laws are essential for models where one random variable depends on another (e.g., a two-stage process or a mixture model).

Total Expectation:
\mathbb{E}[X] = \mathbb{E}[\mathbb{E}[X \mid Y]]
The overall expected value of X is the expected value of the conditional expectation of X given Y.

Total Variance:
\mathrm{Var}(X) = \mathrm{Var}(\mathbb{E}[X \mid Y]) + \mathbb{E}[\mathrm{Var}(X \mid Y)]
The total variance is the sum of the variance of the conditional mean (between-group variance) and the mean of the conditional variance (within-group variance).

Intuitively, the Law of Total Expectation says that averaging the conditional averages of X (one for each outcome of Y) recovers the overall average. Similarly, the Law of Total Variance says that the total variance comes from two sources: variation between the groups defined by Y (the first term) and variation within those groups (the second term).
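The decomposition can be verified exactly on a small two-group mixture; the group probabilities, means, and variances below are arbitrary choices for illustration:

```python
# Assumed two-group mixture: Y selects a group, each with probability 1/2;
# within group y, X has conditional mean m[y] and conditional variance v[y].
p = [0.5, 0.5]
m = [0.0, 2.0]   # E[X | Y = y]
v = [1.0, 4.0]   # Var(X | Y = y)

# Law of Total Expectation: E[X] = E[ E[X | Y] ]
e_x = sum(pi * mi for pi, mi in zip(p, m))

# Law of Total Variance: between-group + within-group
between = sum(pi * (mi - e_x) ** 2 for pi, mi in zip(p, m))  # Var(E[X | Y])
within = sum(pi * vi for pi, vi in zip(p, v))                # E[Var(X | Y)]
var_x = between + within

print(e_x, var_x)  # 1.0 3.5
```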

Covariance and Correlation

These measure the linear relationship between two random variables X and Y.

\text{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])] = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y]
\text{Corr}(X, Y) = \rho_{X,Y} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} \quad \text{where } -1 \le \rho_{X,Y} \le 1

Key Properties of Variance and Covariance:

  1. \text{Var}(aX + b) = a^2\text{Var}(X)
  2. \text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\text{Cov}(X, Y)
  3. If X and Y are independent, \text{Cov}(X, Y) = 0 and \text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y). Note: the converse is not always true (uncorrelated does not imply independent).
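The caveat in the last property can be demonstrated with a standard counterexample: X uniform on {-1, 0, 1} and Y = X^2 are uncorrelated yet clearly dependent. A sketch in exact arithmetic:

```python
from fractions import Fraction

# Standard counterexample to "uncorrelated implies independent":
# X uniform on {-1, 0, 1}, and Y = X**2.
xs = [-1, 0, 1]
p = Fraction(1, 3)

e_x = sum(p * x for x in xs)
e_y = sum(p * x**2 for x in xs)
e_xy = sum(p * x * x**2 for x in xs)  # E[X * Y] = E[X**3]

cov = e_xy - e_x * e_y
print(cov)  # 0 -- uncorrelated

# ...yet dependent: knowing Y = 0 forces X = 0, so the joint probability
# P(X = 0, Y = 0) = 1/3 differs from the product P(X = 0) * P(Y = 0) = 1/9.
p_x0_y0 = p
p_x0 = p
p_y0 = p       # Y = 0 exactly when X = 0
print(p_x0_y0 == p_x0 * p_y0)  # False
```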

Common Relationships Between Distributions

Sum of Bernoullis:
X_1, \dots, X_n \sim \text{Bernoulli}(p) \text{ i.i.d.} \implies \sum_{i=1}^n X_i \sim \text{Binom}(n, p)
Relevance: foundation of the Binomial Option Pricing Model.

Sum of Poissons:
X_i \sim \text{Poisson}(\lambda_i) \text{ independent} \implies \sum_{i=1}^n X_i \sim \text{Poisson}\left(\sum_{i=1}^n \lambda_i\right)
Relevance: used in modeling cumulative event counts (e.g., defaults) over time.

Sum of Normals:
X_i \sim N(\mu_i, \sigma_i^2) \text{ independent} \implies \sum_{i=1}^n X_i \sim N\left(\sum_{i=1}^n \mu_i, \sum_{i=1}^n \sigma_i^2\right)
Relevance: fundamental for portfolio theory and risk aggregation.
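The Poisson-sum rule, for instance, can be checked numerically by convolving two Poisson PMFs; the rate parameters below are arbitrary:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Arbitrary rates for the check.
lam1, lam2 = 1.5, 2.5

# The PMF of an independent sum is the convolution of the individual PMFs;
# it should coincide with the Poisson(lam1 + lam2) PMF at every point.
for k in range(10):
    conv = sum(poisson_pmf(i, lam1) * poisson_pmf(k - i, lam2)
               for i in range(k + 1))
    assert abs(conv - poisson_pmf(k, lam1 + lam2)) < 1e-12

print("convolution matches Poisson(lam1 + lam2)")
```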

III. Fundamental Theorems and Inequalities

These theorems provide the theoretical justification for many statistical and financial models, particularly those involving large samples or long time horizons.

Central Limit Theorem (CLT)

Let X_1, X_2, \dots, X_n be a sequence of i.i.d. random variables with mean \mu and finite variance \sigma^2. As n \to \infty, the distribution of the standardized sample mean approaches the standard normal distribution:

Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1)

Relevance: Justifies the use of the Normal distribution to model asset returns, as returns are the sum of many small, independent price changes. It also underpins statistical inference (e.g., confidence intervals, hypothesis testing).
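A quick simulation sketch of the CLT, using heavily skewed Exponential(1) summands (so \mu = \sigma = 1); the sample sizes are arbitrary choices:

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)

# Standardize sample means of Exponential(1) draws (mu = sigma = 1).
n = 500          # summands per sample mean (arbitrary)
trials = 2000    # number of standardized means to collect (arbitrary)

z = []
for _ in range(trials):
    xbar = mean(random.expovariate(1.0) for _ in range(n))
    z.append((xbar - 1.0) / (1.0 / sqrt(n)))

# Despite the skewed summands, the standardized means should be
# approximately N(0, 1): mean near 0, standard deviation near 1.
print(round(mean(z), 2), round(stdev(z), 2))
```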

Law of Large Numbers (LLN)

The LLN states that the sample average of a large number of independent and identically distributed random variables converges to the expected value as the number of trials grows.

\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{p} \mu \quad \text{(Weak LLN)}

Relevance: Guarantees that Monte Carlo simulations will converge to the true expected value as the number of simulations increases.
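A minimal Monte Carlo sketch of this convergence, estimating E[U^2] = 1/3 for U ~ Uniform(0, 1) (the simulation sizes are arbitrary):

```python
import random

random.seed(42)

# Monte Carlo estimate of E[U^2] for U ~ Uniform(0, 1); true value is 1/3.
def mc_estimate(n):
    return sum(random.random() ** 2 for _ in range(n)) / n

# The estimate tightens around 1/3 as the number of simulations grows.
for n in (100, 10_000, 1_000_000):
    print(n, round(mc_estimate(n), 4))
```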

Markov's and Chebyshev's Inequalities

These inequalities provide bounds on the probability that a random variable deviates from its mean, even when the full distribution is unknown.

Markov's Inequality (for a non-negative random variable X and any a > 0):

\mathbb{P}(X \ge a) \le \frac{\mathbb{E}[X]}{a}

Chebyshev's Inequality (for X with mean \mu, variance \sigma^2, and any k > 0):

\mathbb{P}(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}
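Both bounds can be checked in closed form against a distribution whose tail is known exactly. For X ~ Exponential(1) (mean 1, variance 1, and P(X >= a) = e^{-a}), Markov's bound P(X >= a) <= E[X]/a and Chebyshev's bound P(|X - \mu| >= k\sigma) <= 1/k^2 both hold with room to spare:

```python
from math import exp

# X ~ Exponential(1): E[X] = 1, Var(X) = 1, and P(X >= a) = exp(-a) exactly.

# Markov's inequality: P(X >= a) <= E[X] / a for a > 0.
for a in (2.0, 5.0, 10.0):
    tail = exp(-a)
    markov_bound = 1.0 / a
    assert tail <= markov_bound

# Chebyshev's inequality: P(|X - mu| >= k*sigma) <= 1/k**2.
# For k >= 1 the lower tail P(X <= 1 - k) is empty, so the deviation
# probability is just the upper tail P(X >= 1 + k) = exp(-(1 + k)).
for k in (2.0, 3.0, 5.0):
    dev_prob = exp(-(1.0 + k))
    chebyshev_bound = 1.0 / k**2
    assert dev_prob <= chebyshev_bound

print("both bounds hold for Exponential(1)")
```

Note how loose the bounds are here: that is the price of making no distributional assumptions.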

IV. Quant Finance Specific Tools

These formulas are indispensable for derivative pricing and continuous-time modeling.

Ito's Lemma

Ito's Lemma is the fundamental rule of differentiation for stochastic processes, particularly those involving Brownian motion (Wiener process). It is the stochastic equivalent of the chain rule in standard calculus.

For a function G(t, X_t) where X_t follows the Ito process dX_t = \mu(X_t, t)\, dt + \sigma(X_t, t)\, dW_t, the differential dG is:

dG = \left( \frac{\partial G}{\partial t} + \mu \frac{\partial G}{\partial X} + \frac{1}{2} \sigma^2 \frac{\partial^2 G}{\partial X^2} \right) dt + \sigma \frac{\partial G}{\partial X} dW_t

Relevance: Used to derive the Black-Scholes Partial Differential Equation (PDE) and to find the process followed by a function of an asset price (e.g., the log-price).
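As a worked example, apply Ito's Lemma to G(t, S_t) = \ln S_t, where S_t follows dS_t = \mu S_t\, dt + \sigma S_t\, dW_t, so here \mu(S, t) = \mu S, \sigma(S, t) = \sigma S, and the partial derivatives are \frac{\partial G}{\partial t} = 0, \frac{\partial G}{\partial S} = \frac{1}{S}, \frac{\partial^2 G}{\partial S^2} = -\frac{1}{S^2}:

d(\ln S_t) = \left( 0 + \mu S_t \cdot \frac{1}{S_t} + \frac{1}{2} \sigma^2 S_t^2 \cdot \left( -\frac{1}{S_t^2} \right) \right) dt + \sigma S_t \cdot \frac{1}{S_t}\, dW_t = \left( \mu - \frac{1}{2} \sigma^2 \right) dt + \sigma\, dW_t

This drift-adjusted log-price process is exactly what integrates to the lognormal GBM solution below.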

Geometric Brownian Motion (GBM)

GBM is the most common model for asset prices S_t in continuous time, assuming log-returns are normally distributed.

dS_t = \mu S_t\, dt + \sigma S_t\, dW_t
  • \mu: Drift (expected return)
  • \sigma: Volatility
  • dW_t: Wiener process (Brownian motion)

The solution for S_t is lognormal: S_t = S_0 \exp\left( \left(\mu - \frac{1}{2}\sigma^2\right) t + \sigma W_t \right).
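A simulation sketch of this lognormal solution; it checks the known fact that E[S_T] = S_0 e^{\mu T} (the parameter values are arbitrary illustrations):

```python
import random
from math import exp, sqrt

random.seed(7)

# Simulate GBM terminal values via the exact lognormal solution
# S_T = S0 * exp((mu - sigma**2 / 2) * T + sigma * W_T), with W_T ~ N(0, T).
# Illustrative (assumed) parameters:
s0, mu, sigma, t = 100.0, 0.05, 0.2, 1.0
n_paths = 200_000

total = 0.0
for _ in range(n_paths):
    w_t = random.gauss(0.0, sqrt(t))
    total += s0 * exp((mu - 0.5 * sigma**2) * t + sigma * w_t)

mc_mean = total / n_paths

# Sample mean of S_T vs the theoretical E[S_T] = S0 * exp(mu * T).
print(round(mc_mean, 2), round(s0 * exp(mu * t), 2))
```

The -\frac{1}{2}\sigma^2 correction in the exponent is what makes the expectation come out to S_0 e^{\mu T} rather than something larger.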

Black-Scholes-Merton (BSM) Formula (European Call Option)

The BSM formula provides a closed-form solution for the price C of a European call option:

C(S, t) = S N(d_1) - K e^{-r(T-t)} N(d_2)

where:

d_1 = \frac{\ln(S/K) + (r + \sigma^2/2)(T-t)}{\sigma \sqrt{T-t}}
d_2 = d_1 - \sigma \sqrt{T-t}
  • S: Current stock price
  • K: Strike price
  • r: Risk-free interest rate
  • T-t: Time to maturity
  • \sigma: Volatility of the stock return
  • N(\cdot): Cumulative distribution function of the standard normal distribution
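The formula transcribes directly into code, using the error function for the standard normal CDF; the parameters in the example call are assumed for illustration:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF, N(x), via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bsm_call(s, k, r, sigma, tau):
    """BSM price of a European call; tau = T - t is time to maturity (years)."""
    d1 = (log(s / k) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return s * norm_cdf(d1) - k * exp(-r * tau) * norm_cdf(d2)

# Illustrative parameters: at-the-money call, 1-year maturity.
price = bsm_call(s=100.0, k=100.0, r=0.05, sigma=0.2, tau=1.0)
print(round(price, 4))  # 10.4506
```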

Risk-Neutral Valuation

The First Fundamental Theorem of Asset Pricing states that in a market with no arbitrage, there exists at least one risk-neutral measure \mathbb{Q} under which the price of any derivative V is the discounted expected value of its payoff V_T under this measure.

V_t = e^{-r(T-t)} \mathbb{E}^{\mathbb{Q}}[V_T]

Relevance: This is the core principle of modern derivative pricing. The BSM formula is derived by applying this principle to the GBM process under the risk-neutral measure. The key change is that the drift \mu of the asset price process is replaced by the risk-free rate r.
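A Monte Carlo sketch of risk-neutral valuation for a European call: simulate S_T under \mathbb{Q} (drift r rather than \mu), discount the average payoff at the risk-free rate, and compare with the closed-form BSM price. All parameter values are illustrative assumptions:

```python
import random
from math import log, sqrt, exp, erf

random.seed(1)

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Illustrative parameters: at-the-money call, 1-year maturity.
s0, k, r, sigma, tau = 100.0, 100.0, 0.05, 0.2, 1.0
n_paths = 200_000

# Under Q, S_T = S0 * exp((r - sigma**2 / 2) * tau + sigma * W_tau).
payoff_sum = 0.0
for _ in range(n_paths):
    w = random.gauss(0.0, sqrt(tau))
    s_t = s0 * exp((r - 0.5 * sigma**2) * tau + sigma * w)
    payoff_sum += max(s_t - k, 0.0)

# V_t = e^{-r * tau} * E^Q[payoff]
mc_price = exp(-r * tau) * payoff_sum / n_paths

# Closed-form BSM price for the same parameters, for comparison.
d1 = (log(s0 / k) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
d2 = d1 - sigma * sqrt(tau)
bsm_price = s0 * norm_cdf(d1) - k * exp(-r * tau) * norm_cdf(d2)

print(round(mc_price, 2), round(bsm_price, 2))
```

The two prices agree up to Monte Carlo error, illustrating that the BSM formula is just the closed-form evaluation of the discounted risk-neutral expectation.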
