Lesson 1.5: The Center and The Spread: Expected Value & Variance

Now that we can describe a distribution with a PMF, how do we summarize it? This lesson introduces the two most vital statistics: the Expected Value (E[X]), which finds the 'center of mass' of the distribution, and the Variance (Var(X)), which measures its 'spread' or risk. These two numbers are the foundation of financial risk analysis and machine learning loss functions.

Part 1: The Expected Value (E[X]) - The Center of Gravity

The Core Idea (Analogy): Imagine your PMF is a set of weights placed on a seesaw at different numerical positions. The Expected Value is the single point where you could place the fulcrum to make the seesaw perfectly balance. It's the distribution's center of gravity.

Imagine a seesaw with weights of 0.25kg at x=0, 0.5kg at x=1, and 0.25kg at x=2. The balancing point is at x=1.

The Expected Value is the long-run average of an experiment if you were to repeat it infinitely many times. It's a weighted average of all possible outcomes, where each outcome is weighted by its probability.

Calculating Expected Value
To find the balancing point, we sum each outcome multiplied by its probability (its "mass").

Definition: Expected Value

\[ E[X] = \mu_X = \sum_{\text{all } x} x \cdot P(X=x) = \sum_{\text{all } x} x \cdot p_X(x) \]

Example: A Simple Trading Game

A one-day trade has the following potential outcomes:

  • 50% chance of a +$100 profit (\(p_X(100) = 0.5\))
  • 30% chance of a -$50 loss (\(p_X(-50) = 0.3\))
  • 20% chance of breaking even, $0 (\(p_X(0) = 0.2\))

What is the expected profit of this trade?

\[ E[X] = (100 \cdot 0.5) + (-50 \cdot 0.3) + (0 \cdot 0.2) \]
\[ E[X] = 50 - 15 + 0 = \$35 \]

Even though you never actually make exactly $35 on any single trade, if you made this trade thousands of times, your average profit per trade would converge to $35.
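
The calculation and the long-run claim above can both be checked with a short script. This is a minimal sketch, not part of the original lesson: the names `trade_pmf`, `expected_value`, and `simulate_average` are made up for illustration, and the simulation assumes each trade is an independent draw from the same PMF.

```python
import random

# PMF of the trading game from the example: outcome (in dollars) -> probability.
trade_pmf = {100: 0.5, -50: 0.3, 0: 0.2}


def expected_value(pmf):
    """Probability-weighted average of the outcomes in a PMF."""
    return sum(x * p for x, p in pmf.items())


def simulate_average(pmf, n_trades, seed=0):
    """Average profit over n_trades independent draws from the PMF."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    outcomes = list(pmf.keys())
    weights = list(pmf.values())
    draws = rng.choices(outcomes, weights=weights, k=n_trades)
    return sum(draws) / n_trades


exact = expected_value(trade_pmf)
approx = simulate_average(trade_pmf, n_trades=100_000)
print(f"Exact E[X]: {exact:.2f}")
print(f"Simulated average over 100,000 trades: {approx:.2f}")
```

The simulated average will not equal the exact value, but it should land close to it, and the gap tends to shrink as the number of simulated trades grows.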

Part 2: Variance and Standard Deviation - The Measure of Risk

Expected value is not enough. Consider two investment strategies:

  • Strategy A: Guarantees a return of exactly +5%. (\(E[A] = 5\%\))
  • Strategy B: Has a 50% chance of +30% and a 50% chance of -20%. (\(E[B] = 0.5(30) + 0.5(-20) = 15 - 10 = 5\%\))

They have the same expected value, but Strategy B is far riskier. We need a number to quantify this "spread" or "risk." This is Variance.

The Core Idea: Variance measures the *expected squared deviation* from the mean. In simple terms, it's the probability-weighted average of how far each outcome is from the center, with each distance squared so that deviations above and below the mean cannot cancel out.

Calculating Variance (\(\sigma^2\)) and Standard Deviation (\(\sigma\))
We calculate the weighted average of the squared distances from the mean.

Definition of Variance

\[ \text{Var}(X) = \sigma^2 = E[(X - \mu)^2] = \sum_{\text{all } x} (x - \mu)^2 \cdot p_X(x) \]
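
As a worked illustration, here is the definition applied to the two strategies introduced earlier. Returns are measured in percentage points, so the variances come out in squared percentage points; both strategies have \(\mu = 5\).

\[
\begin{aligned}
\text{Var}(A) &= (5 - 5)^2 \cdot 1 = 0 \\
\text{Var}(B) &= (30 - 5)^2 \cdot 0.5 + (-20 - 5)^2 \cdot 0.5 = 312.5 + 312.5 = 625
\end{aligned}
\]

Strategy A has no spread at all, while Strategy B's outcomes each sit 25 percentage points away from the mean.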

There is a computationally simpler formula, often called the "shortcut formula":

Shortcut Formula for Variance

\[ \text{Var}(X) = E[X^2] - (E[X])^2 \]
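
A sketch of where the shortcut comes from: expand the square in the definition and use the linearity of expectation, treating \(\mu = E[X]\) as a constant.

\[
\begin{aligned}
\text{Var}(X) &= E[(X - \mu)^2] \\
&= E[X^2 - 2\mu X + \mu^2] \\
&= E[X^2] - 2\mu E[X] + \mu^2 \\
&= E[X^2] - 2\mu^2 + \mu^2 \\
&= E[X^2] - (E[X])^2
\end{aligned}
\]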

The Problem with Variance: Units are Squared!

If our trade is in dollars, the variance is in "dollars-squared," which is hard to interpret. We solve this by taking the square root.

Standard Deviation

The Standard Deviation is simply the square root of the variance. It returns our measure of spread to the original units.

\[ \text{SD}(X) = \sigma = \sqrt{\text{Var}(X)} \]
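
The formulas in this part can be tried out on the PMFs used in this lesson. The sketch below is illustrative only: the function and variable names are made up here, and it computes the variance both from the definition and from the shortcut formula so the two can be compared.

```python
import math

# PMFs from this lesson: outcome -> probability.
trade_pmf = {100: 0.5, -50: 0.3, 0: 0.2}  # profit in dollars
strategy_a = {5: 1.0}                     # return in percent
strategy_b = {30: 0.5, -20: 0.5}          # return in percent


def expected_value(pmf):
    """E[X]: probability-weighted average of the outcomes."""
    return sum(x * p for x, p in pmf.items())


def variance_definition(pmf):
    """Var(X) as the weighted average of squared distances from the mean."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def variance_shortcut(pmf):
    """Var(X) as E[X^2] - (E[X])^2."""
    second_moment = sum(x ** 2 * p for x, p in pmf.items())
    return second_moment - expected_value(pmf) ** 2


def std_dev(pmf):
    """SD(X): square root of the variance, in the original units."""
    return math.sqrt(variance_definition(pmf))


for name, pmf in [("Trade", trade_pmf), ("Strategy A", strategy_a), ("Strategy B", strategy_b)]:
    print(
        f"{name}: mean={expected_value(pmf):.2f}, "
        f"var(definition)={variance_definition(pmf):.2f}, "
        f"var(shortcut)={variance_shortcut(pmf):.2f}, "
        f"sd={std_dev(pmf):.2f}"
    )
```

For the trading game, both formulas give a variance of 4525 dollars-squared, so the standard deviation is about $67.27. Strategies A and B share a mean of 5%, but their standard deviations are 0% and 25%, which is the difference in risk that the expected value alone could not show.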

Part 3: The Payoff: The Language of Modern Finance

Quant Finance: The Mean-Variance Framework

The ideas in this lesson are the bedrock of Modern Portfolio Theory, developed by Harry Markowitz (work for which he shared the 1990 Nobel Memorial Prize in Economic Sciences). At its core, the theory summarizes each asset with two numbers:

  • Expected Return (The "Good"): This is simply the expected value \(E[R]\) of an asset's returns. Investors want to maximize this.
  • Volatility (The "Bad"): This is the standard deviation of an asset's returns, \(\sigma_R\). It is the standard measure of an asset's risk in this framework. Investors want to minimize this.

Quants, portfolio managers, and risk analysts all speak the language of mean and variance. It captures the fundamental trade-off of investing: how much risk (volatility) are you willing to take on for a given amount of expected return?

Machine Learning: Loss Functions and Regularization

Expected value and variance are also central to how we train and evaluate models.

  • Mean Squared Error (MSE): The most common loss function for regression is literally the average of squared errors. It's an empirical estimate of the expected squared prediction error, which equals the variance of the errors only when their mean is zero. Minimizing MSE means we are trying to build a model whose errors are both centered on zero and tightly clustered.
  • Bias-Variance Tradeoff: A core concept in ML (which we'll cover in Module 3) is the tradeoff between a model's bias (how wrong its average prediction is) and its variance (how much its predictions fluctuate with different training data). Understanding variance is key to diagnosing whether a model is "overfitting."
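
To make the link between MSE and variance concrete, the shortcut formula applied to the prediction errors says that MSE equals the variance of the errors plus the square of their mean. The sketch below checks this on a tiny dataset. The numbers in `y_true` and `y_pred` are made up purely for illustration and do not come from any real model.

```python
# Hypothetical targets and predictions, made up purely for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 8.0, 10.0]

# Prediction error for each data point.
errors = [pred - true for pred, true in zip(y_pred, y_true)]
n = len(errors)

# MSE: the plain average of the squared errors.
mse = sum(e ** 2 for e in errors) / n

# Mean error: how far off the predictions are on average (a systematic offset).
mean_error = sum(errors) / n

# Variance of the errors around their own mean (dividing by n, not n - 1).
error_variance = sum((e - mean_error) ** 2 for e in errors) / n

print(f"MSE:                       {mse:.4f}")
print(f"Error variance:            {error_variance:.4f}")
print(f"Squared mean error:        {mean_error ** 2:.4f}")
print(f"Variance + squared mean:   {error_variance + mean_error ** 2:.4f}")
```

In this example the predictions run high on average, so the MSE is larger than the error variance. The two coincide only when the errors average out to zero.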

Summary: Expected Value vs. Variance

  • Expected Value (\(E[X]\)): The center of the distribution. A measure of central tendency or long-run average.
  • Variance (\(\text{Var}(X)\)): The spread of the distribution. A measure of dispersion, risk, or uncertainty.
  • Standard Deviation (\(\sigma\)): The square root of variance. It's the most common measure of risk because its units are easy to interpret.