Probability & Statistics: Formulas and Theorems

Essential Formulas and Theorems

A deep understanding of core statistical principles is crucial for modeling financial markets, pricing derivatives, and managing risk.

I. Core Probability Laws

These laws govern how probabilities are calculated and updated, forming the basis for statistical inference and decision-making under uncertainty.

Conditional Probability, Bayes' Theorem, and Law of Total Probability

Consider events A_1, \dots, A_n which form a partition of the sample space (i.e., they are mutually exclusive and collectively exhaustive) and an event B.

Conditional Probability:
\mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}
The probability of event A occurring given that event B has already occurred.

Law of Total Probability:
\mathbb{P}(B) = \sum_{i=1}^n \mathbb{P}(B \cap A_i) = \sum_{i=1}^n \mathbb{P}(B \mid A_i)\mathbb{P}(A_i)
Used to find the marginal probability of an event B when the sample space is partitioned.

Bayes' Theorem:
\mathbb{P}(A_1 \mid B) = \frac{\mathbb{P}(B \mid A_1)\mathbb{P}(A_1)}{\mathbb{P}(B)}
Relates the posterior probability \mathbb{P}(A_1 \mid B) to the prior \mathbb{P}(A_1) and the likelihood \mathbb{P}(B \mid A_1). Relevance: crucial for updating beliefs as new data arrives.
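As a quick sanity check, the three laws above compose into the classic false-positive calculation. The prevalence and test accuracies below are made-up illustrative numbers, not data from the text:

```python
# Hypothetical numbers: a test with 99% sensitivity and a 5% false-positive
# rate, for a condition with 1% prevalence.
p_condition = 0.01
p_pos_given_condition = 0.99       # likelihood P(B | A1)
p_pos_given_no_condition = 0.05    # P(B | A2), the false-positive rate

# Law of Total Probability: P(B) = sum_i P(B | A_i) * P(A_i)
p_pos = (p_pos_given_condition * p_condition
         + p_pos_given_no_condition * (1 - p_condition))

# Bayes' Theorem: posterior P(A1 | B)
posterior = p_pos_given_condition * p_condition / p_pos
print(round(posterior, 4))  # 0.1667
```

Despite the accurate test, the posterior is only about 1/6, because the prior is so small: a standard interview talking point.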

II. Moments and Relationships

Moments describe the shape and location of a probability distribution. Understanding their properties is key to manipulating random variables in models.

Law of the Unconscious Statistician (LOTUS)

The expected value of a function g(X) of a random variable can be calculated without first finding the distribution of Y = g(X).

\mathbb{E}[g(X)] = \int_{\mathbb{R}} g(x)\, f_X(x)\, dx \quad (\text{continuous } X), \qquad \mathbb{E}[g(X)] = \sum_{k \in \mathrm{Supp}(X)} g(k)\, \mathbb{P}(X = k) \quad (\text{discrete } X)
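A minimal discrete illustration of LOTUS, using an assumed example of a fair six-sided die and g(x) = x^2:

```python
from fractions import Fraction

# Discrete LOTUS: E[g(X)] = sum over the support of g(k) * P(X = k).
# Assumed example: X is a fair six-sided die, g(x) = x**2.
support = range(1, 7)
p = Fraction(1, 6)

e_g_lotus = sum(Fraction(k**2) * p for k in support)

# Same answer via the distribution of Y = g(X) directly
# (g is injective on this support, so Y inherits the uniform weights).
e_y = sum(Fraction(y) * p for y in (k**2 for k in support))

print(e_g_lotus)  # 91/6
```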

Law of Total Expectation and Variance

These laws are essential for models where one random variable depends on another (e.g., a two-stage process or a mixture model).

Total Expectation:
\mathbb{E}[X] = \mathbb{E}[\mathbb{E}[X \mid Y]]
The overall expected value of X is the expected value of the conditional expectation of X given Y.

Total Variance:
\mathrm{Var}(X) = \mathrm{Var}(\mathbb{E}[X \mid Y]) + \mathbb{E}[\mathrm{Var}(X \mid Y)]
The total variance is the sum of the variance of the conditional mean (between-group variance) and the mean of the conditional variance (within-group variance).

Intuitively, the Law of Total Expectation says that averaging the conditional averages of X (one for each outcome of Y) recovers the overall average. Similarly, the Law of Total Variance says that the total variance comes from two sources: variation between the groups defined by Y (the first term) and variation within those groups (the second term).
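The decomposition can be verified exactly on a small two-group mixture; the group probabilities, means, and variances below are arbitrary choices for illustration:

```python
# Assumed two-group mixture: Y selects a group, each with probability 1/2;
# within group y, X has conditional mean m[y] and conditional variance v[y].
p = [0.5, 0.5]
m = [0.0, 2.0]   # E[X | Y = y]
v = [1.0, 4.0]   # Var(X | Y = y)

# Law of Total Expectation: E[X] = E[ E[X | Y] ]
e_x = sum(pi * mi for pi, mi in zip(p, m))

# Law of Total Variance: between-group + within-group
between = sum(pi * (mi - e_x) ** 2 for pi, mi in zip(p, m))  # Var(E[X | Y])
within = sum(pi * vi for pi, vi in zip(p, v))                # E[Var(X | Y)]
var_x = between + within

print(e_x, var_x)  # 1.0 3.5
```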

Covariance and Correlation

These measure the linear relationship between two random variables X and Y.

\text{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])] = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y]
\text{Corr}(X, Y) = \rho_{X,Y} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} \quad \text{where } -1 \le \rho_{X,Y} \le 1

Key Properties of Variance and Covariance:

  1. \text{Var}(aX + b) = a^2\text{Var}(X)
  2. \text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\text{Cov}(X, Y)
  3. If X and Y are independent, \text{Cov}(X, Y) = 0 and \text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y). Note: the converse is not always true (uncorrelated does not imply independent).
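The caveat in the last property can be demonstrated with a standard counterexample: X uniform on {-1, 0, 1} and Y = X^2 are uncorrelated yet clearly dependent. A sketch in exact arithmetic:

```python
from fractions import Fraction

# Standard counterexample to "uncorrelated implies independent":
# X uniform on {-1, 0, 1}, and Y = X**2.
xs = [-1, 0, 1]
p = Fraction(1, 3)

e_x = sum(p * x for x in xs)
e_y = sum(p * x**2 for x in xs)
e_xy = sum(p * x * x**2 for x in xs)  # E[X * Y] = E[X**3]

cov = e_xy - e_x * e_y
print(cov)  # 0 -- uncorrelated

# ...yet dependent: knowing Y = 0 forces X = 0, so the joint probability
# P(X = 0, Y = 0) = 1/3 differs from the product P(X = 0) * P(Y = 0) = 1/9.
p_x0_y0 = p
p_x0 = p
p_y0 = p       # Y = 0 exactly when X = 0
print(p_x0_y0 == p_x0 * p_y0)  # False
```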

Common Relationships Between Distributions

Sum of Bernoullis:
X_1, \dots, X_n \sim \text{Bernoulli}(p) \text{ i.i.d.} \implies \sum_{i=1}^n X_i \sim \text{Binom}(n, p)
Relevance: foundation of the Binomial Option Pricing Model.

Sum of Poissons:
X_i \sim \text{Poisson}(\lambda_i) \text{ independent} \implies \sum_{i=1}^n X_i \sim \text{Poisson}\left(\sum_{i=1}^n \lambda_i\right)
Relevance: used in modeling cumulative event counts (e.g., defaults) over time.

Sum of Normals:
X_i \sim N(\mu_i, \sigma_i^2) \text{ independent} \implies \sum_{i=1}^n X_i \sim N\left(\sum_{i=1}^n \mu_i, \sum_{i=1}^n \sigma_i^2\right)
Relevance: fundamental for portfolio theory and risk aggregation.
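The Poisson-sum rule, for instance, can be checked numerically by convolving two Poisson PMFs; the rate parameters below are arbitrary:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Arbitrary rates for the check.
lam1, lam2 = 1.5, 2.5

# The PMF of an independent sum is the convolution of the individual PMFs;
# it should coincide with the Poisson(lam1 + lam2) PMF at every point.
for k in range(10):
    conv = sum(poisson_pmf(i, lam1) * poisson_pmf(k - i, lam2)
               for i in range(k + 1))
    assert abs(conv - poisson_pmf(k, lam1 + lam2)) < 1e-12

print("convolution matches Poisson(lam1 + lam2)")
```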

III. Fundamental Theorems and Inequalities

These theorems provide the theoretical justification for many statistical and financial models, particularly those involving large samples or long time horizons.

Central Limit Theorem (CLT)

Let X_1, X_2, \dots, X_n be a sequence of i.i.d. random variables with mean \mu and finite variance \sigma^2. As n \to \infty, the distribution of the standardized sample mean approaches the standard normal distribution:

Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1)

Relevance: Justifies the use of the Normal distribution to model asset returns, as returns are the sum of many small, independent price changes. It also underpins statistical inference (e.g., confidence intervals, hypothesis testing).
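A quick simulation sketch of the CLT, using heavily skewed Exponential(1) summands (so \mu = \sigma = 1); the sample sizes are arbitrary choices:

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)

# Standardize sample means of Exponential(1) draws (mu = sigma = 1).
n = 500          # summands per sample mean (arbitrary)
trials = 2000    # number of standardized means to collect (arbitrary)

z = []
for _ in range(trials):
    xbar = mean(random.expovariate(1.0) for _ in range(n))
    z.append((xbar - 1.0) / (1.0 / sqrt(n)))

# Despite the skewed summands, the standardized means should be
# approximately N(0, 1): mean near 0, standard deviation near 1.
print(round(mean(z), 2), round(stdev(z), 2))
```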

Law of Large Numbers (LLN)

The LLN states that the sample average of a large number of independent and identically distributed random variables converges to the expected value as the number of trials grows.

\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{p} \mu \quad \text{(Weak LLN)}

Relevance: Guarantees that Monte Carlo simulations will converge to the true expected value as the number of simulations increases.
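A minimal Monte Carlo sketch of this convergence, estimating E[U^2] = 1/3 for U ~ Uniform(0, 1) (the simulation sizes are arbitrary):

```python
import random

random.seed(42)

# Monte Carlo estimate of E[U^2] for U ~ Uniform(0, 1); true value is 1/3.
def mc_estimate(n):
    return sum(random.random() ** 2 for _ in range(n)) / n

# The estimate tightens around 1/3 as the number of simulations grows.
for n in (100, 10_000, 1_000_000):
    print(n, round(mc_estimate(n), 4))
```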

Markov's and Chebyshev's Inequalities

These inequalities provide bounds on the probability that a random variable deviates from its mean, even when the full distribution is unknown.

Markov's Inequality (for a non-negative random variable X and any a > 0):

\mathbb{P}(X \ge a) \le \frac{\mathbb{E}[X]}{a}

Chebyshev's Inequality (for X with mean \mu, variance \sigma^2, and any k > 0):

\mathbb{P}(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}
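Both bounds can be checked in closed form against a distribution whose tail is known exactly. For X ~ Exponential(1) (mean 1, variance 1, and P(X >= a) = e^{-a}), Markov's bound P(X >= a) <= E[X]/a and Chebyshev's bound P(|X - \mu| >= k\sigma) <= 1/k^2 both hold with room to spare:

```python
from math import exp

# X ~ Exponential(1): E[X] = 1, Var(X) = 1, and P(X >= a) = exp(-a) exactly.

# Markov's inequality: P(X >= a) <= E[X] / a for a > 0.
for a in (2.0, 5.0, 10.0):
    tail = exp(-a)
    markov_bound = 1.0 / a
    assert tail <= markov_bound

# Chebyshev's inequality: P(|X - mu| >= k*sigma) <= 1/k**2.
# For k >= 1 the lower tail P(X <= 1 - k) is empty, so the deviation
# probability is just the upper tail P(X >= 1 + k) = exp(-(1 + k)).
for k in (2.0, 3.0, 5.0):
    dev_prob = exp(-(1.0 + k))
    chebyshev_bound = 1.0 / k**2
    assert dev_prob <= chebyshev_bound

print("both bounds hold for Exponential(1)")
```

Note how loose the bounds are here: that is the price of making no distributional assumptions.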

IV. Quant Finance Specific Tools

These formulas are indispensable for derivative pricing and continuous-time modeling.

Ito's Lemma

Ito's Lemma is the fundamental rule of differentiation for stochastic processes, particularly those involving Brownian motion (Wiener process). It is the stochastic equivalent of the chain rule in standard calculus.

For a function G(t, X_t) where X_t follows the Ito process dX_t = \mu(X_t, t)\, dt + \sigma(X_t, t)\, dW_t, the differential dG is:

dG = \left( \frac{\partial G}{\partial t} + \mu \frac{\partial G}{\partial X} + \frac{1}{2} \sigma^2 \frac{\partial^2 G}{\partial X^2} \right) dt + \sigma \frac{\partial G}{\partial X} dW_t

Relevance: Used to derive the Black-Scholes Partial Differential Equation (PDE) and to find the process followed by a function of an asset price (e.g., the log-price).
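As a worked example, apply Ito's Lemma to G(t, S_t) = \ln S_t, where S_t follows dS_t = \mu S_t\, dt + \sigma S_t\, dW_t, so here \mu(S, t) = \mu S, \sigma(S, t) = \sigma S, and the partial derivatives are \frac{\partial G}{\partial t} = 0, \frac{\partial G}{\partial S} = \frac{1}{S}, \frac{\partial^2 G}{\partial S^2} = -\frac{1}{S^2}:

d(\ln S_t) = \left( 0 + \mu S_t \cdot \frac{1}{S_t} + \frac{1}{2} \sigma^2 S_t^2 \cdot \left( -\frac{1}{S_t^2} \right) \right) dt + \sigma S_t \cdot \frac{1}{S_t}\, dW_t = \left( \mu - \frac{1}{2} \sigma^2 \right) dt + \sigma\, dW_t

This drift-adjusted log-price process is exactly what integrates to the lognormal GBM solution below.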

Geometric Brownian Motion (GBM)

GBM is the most common model for asset prices S_t in continuous time, assuming log-returns are normally distributed.

dS_t = \mu S_t\, dt + \sigma S_t\, dW_t
  • \mu: Drift (expected return)
  • \sigma: Volatility
  • dW_t: Wiener process (Brownian motion)

The solution for S_t is lognormal: S_t = S_0 \exp\left( \left(\mu - \frac{1}{2}\sigma^2\right) t + \sigma W_t \right).
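A simulation sketch of this lognormal solution; it checks the known fact that E[S_T] = S_0 e^{\mu T} (the parameter values are arbitrary illustrations):

```python
import random
from math import exp, sqrt

random.seed(7)

# Simulate GBM terminal values via the exact lognormal solution
# S_T = S0 * exp((mu - sigma**2 / 2) * T + sigma * W_T), with W_T ~ N(0, T).
# Illustrative (assumed) parameters:
s0, mu, sigma, t = 100.0, 0.05, 0.2, 1.0
n_paths = 200_000

total = 0.0
for _ in range(n_paths):
    w_t = random.gauss(0.0, sqrt(t))
    total += s0 * exp((mu - 0.5 * sigma**2) * t + sigma * w_t)

mc_mean = total / n_paths

# Sample mean of S_T vs the theoretical E[S_T] = S0 * exp(mu * T).
print(round(mc_mean, 2), round(s0 * exp(mu * t), 2))
```

The -\frac{1}{2}\sigma^2 correction in the exponent is what makes the expectation come out to S_0 e^{\mu T} rather than something larger.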

Black-Scholes-Merton (BSM) Formula (European Call Option)

The BSM formula provides a closed-form solution for the price C of a European call option:

C(S, t) = S N(d_1) - K e^{-r(T-t)} N(d_2)

where:

d_1 = \frac{\ln(S/K) + (r + \sigma^2/2)(T-t)}{\sigma \sqrt{T-t}}
d_2 = d_1 - \sigma \sqrt{T-t}
  • S: Current stock price
  • K: Strike price
  • r: Risk-free interest rate
  • T-t: Time to maturity
  • \sigma: Volatility of the stock return
  • N(\cdot): Cumulative distribution function of the standard normal distribution
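The formula transcribes directly into code, using the error function for the standard normal CDF; the parameters in the example call are assumed for illustration:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF, N(x), via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bsm_call(s, k, r, sigma, tau):
    """BSM price of a European call; tau = T - t is time to maturity (years)."""
    d1 = (log(s / k) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return s * norm_cdf(d1) - k * exp(-r * tau) * norm_cdf(d2)

# Illustrative parameters: at-the-money call, 1-year maturity.
price = bsm_call(s=100.0, k=100.0, r=0.05, sigma=0.2, tau=1.0)
print(round(price, 4))  # 10.4506
```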

Risk-Neutral Valuation

The First Fundamental Theorem of Asset Pricing states that in a market with no arbitrage, there exists at least one risk-neutral measure \mathbb{Q} under which the price of any derivative V is the discounted expected value of its payoff V_T under this measure.

V_t = e^{-r(T-t)} \mathbb{E}^{\mathbb{Q}}[V_T]

Relevance: This is the core principle of modern derivative pricing. The BSM formula is derived by applying this principle to the GBM process under the risk-neutral measure. The key change is that the drift \mu of the asset price process is replaced by the risk-free rate r.
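A Monte Carlo sketch of risk-neutral valuation for a European call: simulate S_T under \mathbb{Q} (drift r rather than \mu), discount the average payoff at the risk-free rate, and compare with the closed-form BSM price. All parameter values are illustrative assumptions:

```python
import random
from math import log, sqrt, exp, erf

random.seed(1)

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Illustrative parameters: at-the-money call, 1-year maturity.
s0, k, r, sigma, tau = 100.0, 100.0, 0.05, 0.2, 1.0
n_paths = 200_000

# Under Q, S_T = S0 * exp((r - sigma**2 / 2) * tau + sigma * W_tau).
payoff_sum = 0.0
for _ in range(n_paths):
    w = random.gauss(0.0, sqrt(tau))
    s_t = s0 * exp((r - 0.5 * sigma**2) * tau + sigma * w)
    payoff_sum += max(s_t - k, 0.0)

# V_t = e^{-r * tau} * E^Q[payoff]
mc_price = exp(-r * tau) * payoff_sum / n_paths

# Closed-form BSM price for the same parameters, for comparison.
d1 = (log(s0 / k) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
d2 = d1 - sigma * sqrt(tau)
bsm_price = s0 * norm_cdf(d1) - k * exp(-r * tau) * norm_cdf(d2)

print(round(mc_price, 2), round(bsm_price, 2))
```

The two prices agree up to Monte Carlo error, illustrating that the BSM formula is just the closed-form evaluation of the discounted risk-neutral expectation.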
