Lesson 5.9: The Workhorse of Modern Finance: The GARCH Model

We now extend the ARCH model to its generalized form: the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model. By adding a 'memory' of its own past variance forecasts, GARCH models volatility clustering more parsimoniously than ARCH, which is why it has become a standard tool in financial risk management.

Part 1: The Limitation of ARCH and the GARCH Insight

In the previous lesson, we learned that the ARCH(q) model explains today's variance as a function of the last $q$ squared shocks ( $\epsilon_{t-1}^2, \dots, \epsilon_{t-q}^2$ ). While groundbreaking, practitioners quickly found a problem: financial volatility is very persistent. The effect of a shock on volatility fades only slowly.

To capture this long memory, an ARCH model would need a very large number of lags ( $q$ ), making it unwieldy and non-parsimonious. For example, you might need an ARCH(22) model to adequately describe daily stock returns.

The Core Insight: Adding an Autoregressive Term to Variance

Tim Bollerslev (1986), a student of Robert Engle, proposed a brilliant and elegant solution. He asked:

"What if today's variance depends not just on yesterday's shock, but also on yesterday's variance itself?"

This adds a feedback loop, or an autoregressive (AR) term, to the variance equation. The "G" in GARCH stands for **Generalized**, reflecting that it is a more flexible and powerful version of the ARCH model.

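Loosely, an ARCH model is like an MA model for variance, because variance depends only on a finite number of past shocks, and a GARCH model is like an ARMA model for variance. (Formally, ARCH(q) implies an AR(q) process for the squared shocks $\epsilon_t^2$, and GARCH(1,1) implies an ARMA(1,1) process for them.) By adding the AR term, GARCH can capture persistent dynamics with far fewer parameters.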
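To see why one lagged-variance term can stand in for many ARCH lags, note that substituting the GARCH(1,1) equation into itself repeatedly gives a weight of $\alpha_1 \beta_1^{k-1}$ on the squared shock from $k$ days ago. The sketch below tabulates those implied weights; the parameter values are illustrative assumptions, not estimates from data.

```python
# Implied ARCH(infinity) weights of a GARCH(1,1) model.
# alpha1 and beta1 are illustrative assumptions, not fitted values.
alpha1, beta1 = 0.10, 0.85

def implied_arch_weights(alpha1, beta1, n_lags):
    """Weight on the squared shock k days back: alpha1 * beta1**(k - 1)."""
    return [alpha1 * beta1 ** (k - 1) for k in range(1, n_lags + 1)]

weights = implied_arch_weights(alpha1, beta1, n_lags=22)
total_weight = alpha1 / (1.0 - beta1)    # sum of the weights over all lags
covered = sum(weights) / total_weight    # share captured by the first 22 lags

print(f"weight on lag 1:  {weights[0]:.4f}")
print(f"weight on lag 22: {weights[-1]:.4f}")
print(f"share of total weight in the first 22 lags: {covered:.1%}")
```

Two parameters therefore generate a slowly decaying pattern of weights that an ARCH model would have to estimate lag by lag.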

Part 2: The GARCH(p,q) Model Specification

A GARCH model adds $p$ autoregressive lags of the conditional variance itself to the ARCH(q) specification.

The GARCH(p,q) Model Specification

As before, the model has two equations. The mean equation can be any appropriate model for the returns themselves. The innovation comes from the conditional variance equation:

$$\sigma_t^2 = \alpha_0 + \sum_{i=1}^q \alpha_i \epsilon_{t-i}^2 + \sum_{j=1}^p \beta_j \sigma_{t-j}^2$$

$\sigma_t^2$ : The **conditional variance** for period $t$ (our forecast).
$\alpha_0$ : The constant term.
$\sum_{i=1}^q \alpha_i \epsilon_{t-i}^2$ : The **ARCH component**. This is the "news" or "shock" term. It measures the reaction to the magnitude of past surprises.
$\sum_{j=1}^p \beta_j \sigma_{t-j}^2$ : The **GARCH component**. This is the "memory" or "persistence" term. It measures how much of yesterday's volatility level carries over to today.
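The variance equation can be read as a simple filter that walks through the residuals one period at a time. The following is a minimal sketch, not a production implementation: the function name `garch_variance`, the example numbers, and the choice to fill pre-sample values with `init_var` are all assumptions made for illustration.

```python
import numpy as np

def garch_variance(eps, alpha0, alphas, betas, init_var=None):
    """Filter conditional variances sigma_t^2 from residuals eps.

    alphas: ARCH coefficients alpha_1..alpha_q (on lagged squared shocks)
    betas:  GARCH coefficients beta_1..beta_p (on lagged variances)
    """
    eps = np.asarray(eps, dtype=float)
    q, p = len(alphas), len(betas)
    # Assumption: pre-sample squared shocks and variances are set to init_var
    # (default: the sample variance of eps).
    if init_var is None:
        init_var = eps.var()
    sigma2 = np.empty(len(eps))
    for t in range(len(eps)):
        arch_part = sum(
            alphas[i] * (eps[t - 1 - i] ** 2 if t - 1 - i >= 0 else init_var)
            for i in range(q)
        )
        garch_part = sum(
            betas[j] * (sigma2[t - 1 - j] if t - 1 - j >= 0 else init_var)
            for j in range(p)
        )
        sigma2[t] = alpha0 + arch_part + garch_part
    return sigma2

# Hypothetical residuals and parameters, for illustration only
eps = np.array([0.5, -2.0, 0.3, 0.1, -0.2])
sigma2 = garch_variance(eps, alpha0=0.05, alphas=[0.10], betas=[0.85], init_var=1.0)
print(sigma2.round(4))
```

Passing an empty `betas` list reduces the filter to a pure ARCH(q) model, which makes the "generalized" relationship between the two models explicit.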

The GARCH(1,1) Model: The Industry Standard

By far the most common and successful specification in practice is the GARCH(1,1) model:

$$\sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2$$

Interpretation:

Today's variance is a weighted average of three things: the long-run average variance $\alpha_0 / (1 - \alpha_1 - \beta_1)$ (weight $1 - \alpha_1 - \beta_1$), yesterday's squared shock $\epsilon_{t-1}^2$ (weight $\alpha_1$), and yesterday's variance $\sigma_{t-1}^2$ (weight $\beta_1$).
The coefficient $\beta_1$ typically has a large value (e.g., 0.8 to 0.98), indicating that volatility is highly persistent.
The coefficient $\alpha_1$ is typically small (e.g., 0.05 to 0.2), indicating that while new shocks are important, they don't completely overwrite the existing level of volatility.

The sum $\alpha_1 + \beta_1$ is called the **persistence parameter**. The closer this sum is to 1, the longer the memory of a shock to volatility.
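One way to make "longer memory" concrete is the half-life of a volatility shock: the number of days until the gap between the variance forecast and its long-run level has halved. It follows from the standard GARCH(1,1) property that this gap shrinks by a factor of $\alpha_1 + \beta_1$ per day. The sketch below assumes that property; the function name and the parameter pairs are illustrative.

```python
import math

def volatility_half_life(alpha1, beta1):
    """Days until a shock's effect on the variance forecast has halved."""
    persistence = alpha1 + beta1
    if not 0.0 < persistence < 1.0:
        raise ValueError("half-life is only defined for 0 < alpha1 + beta1 < 1")
    return math.log(0.5) / math.log(persistence)

# Illustrative parameter pairs (assumptions, not estimates)
for a, b in [(0.10, 0.80), (0.10, 0.85), (0.05, 0.94)]:
    print(f"persistence {a + b:.2f} -> half-life {volatility_half_life(a, b):.1f} days")
```

Small changes in persistence near 1 translate into large changes in how long a shock lingers, which is why the sum is worth reporting alongside the individual coefficients.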

Constraints for a well-behaved model:

To ensure a positive conditional variance and a covariance-stationary process with a finite long-run variance, we require:

$$\alpha_0 > 0, \quad \alpha_1 \ge 0, \quad \beta_1 \ge 0, \quad \text{and} \quad \alpha_1 + \beta_1 < 1$$
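These constraints are easy to check in code before simulating or forecasting. The sketch below validates a parameter set and then simulates a GARCH(1,1) path; the helper names, the normally distributed innovations, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def check_garch11(alpha0, alpha1, beta1):
    """Return True if the GARCH(1,1) positivity and stationarity constraints hold."""
    return alpha0 > 0 and alpha1 >= 0 and beta1 >= 0 and (alpha1 + beta1) < 1

def simulate_garch11(alpha0, alpha1, beta1, n, seed=0):
    """Simulate shocks eps_t = sigma_t * z_t with z_t ~ N(0, 1) (an assumption)."""
    if not check_garch11(alpha0, alpha1, beta1):
        raise ValueError("parameters violate the GARCH(1,1) constraints")
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    eps = np.empty(n)
    sigma2 = np.empty(n)
    # Start the recursion at the long-run variance alpha0 / (1 - alpha1 - beta1)
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)
    eps[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * z[t]
    return eps, sigma2

eps, sigma2 = simulate_garch11(alpha0=0.05, alpha1=0.10, beta1=0.85, n=2000)
print("constraints hold for (0.05, 0.10, 0.85):", check_garch11(0.05, 0.10, 0.85))
print("constraints hold for (0.05, 0.20, 0.85):", check_garch11(0.05, 0.20, 0.85))
print("smallest simulated variance:", sigma2.min())
```

Because every term in the recursion is non-negative, the simulated variance can never fall below $\alpha_0$, which is the practical meaning of the positivity constraints.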

Part 3: Forecasting with GARCH

One of the great strengths of GARCH is its simple and powerful forecasting mechanism. Once a GARCH(1,1) model is estimated, we can generate multi-step-ahead forecasts for the conditional variance.

GARCH(1,1) Forecasting Equations

The one-step-ahead forecast, made at time $t$ , is simply the GARCH equation itself, as all terms on the right are known at time $t$ :

$$\hat{\sigma}_{t+1|t}^2 = \hat{\alpha}_0 + \hat{\alpha}_1 \epsilon_{t}^2 + \hat{\beta}_1 \sigma_{t}^2$$

To forecast two steps ahead, we must replace the terms that are not yet known at time $t$ . The best forecast of the squared shock $\epsilon_{t+1}^2$ is its conditional expectation $E[\epsilon_{t+1}^2 | \mathcal{F}_t]$ , which equals the one-step-ahead variance forecast $\hat{\sigma}_{t+1|t}^2$ .

$$\hat{\sigma}_{t+2|t}^2 = \hat{\alpha}_0 + \hat{\alpha}_1 E[\epsilon_{t+1}^2 | \mathcal{F}_t] + \hat{\beta}_1 \hat{\sigma}_{t+1|t}^2$$

$$\hat{\sigma}_{t+2|t}^2 = \hat{\alpha}_0 + (\hat{\alpha}_1 + \hat{\beta}_1) \hat{\sigma}_{t+1|t}^2$$
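The same substitution applies at every further step, so the forecast path can be built with a short loop. This is a minimal sketch of that recursion; the function name and the input values are hypothetical.

```python
def garch11_forecast_path(alpha0, alpha1, beta1, last_eps, last_sigma2, horizon):
    """Variance forecasts for t+1, ..., t+horizon, made at time t."""
    forecasts = []
    # One step ahead: every term on the right-hand side is known at time t
    sigma2_next = alpha0 + alpha1 * last_eps ** 2 + beta1 * last_sigma2
    forecasts.append(sigma2_next)
    # Two or more steps ahead: the unknown squared shock is replaced by its
    # expected value, which is the previous step's variance forecast
    for _ in range(horizon - 1):
        sigma2_next = alpha0 + (alpha1 + beta1) * sigma2_next
        forecasts.append(sigma2_next)
    return forecasts

# Hypothetical inputs: a large negative shock on the last observed day
path = garch11_forecast_path(
    alpha0=0.05, alpha1=0.10, beta1=0.85, last_eps=-3.0, last_sigma2=1.5, horizon=5
)
for h, variance in enumerate(path, start=1):
    print(f"h={h}: variance forecast {variance:.4f}")
```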

As the forecast horizon $h$ increases, the forecast will mean-revert towards the **unconditional (long-run) variance** of the process, which is given by:

$$\sigma^2 = \frac{\alpha_0}{1 - (\alpha_1 + \beta_1)}$$
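Iterating the two-step formula gives a closed form for any horizon: $\hat{\sigma}_{t+h|t}^2 = \sigma^2 + (\alpha_1 + \beta_1)^{h-1} (\hat{\sigma}_{t+1|t}^2 - \sigma^2)$, so the gap to the long-run variance shrinks geometrically. The sketch below checks the closed form against the step-by-step recursion, using illustrative parameter values and hypothetical starting forecasts.

```python
# Illustrative parameter values (assumptions, not estimates)
alpha0, alpha1, beta1 = 0.05, 0.10, 0.85
persistence = alpha1 + beta1
long_run_var = alpha0 / (1.0 - persistence)

def forecast_closed_form(sigma2_one_step, h):
    """h-step-ahead variance forecast from the geometric-decay formula."""
    return long_run_var + persistence ** (h - 1) * (sigma2_one_step - long_run_var)

def forecast_recursive(sigma2_one_step, h):
    """Same forecast, built by applying the one-step recursion h - 1 times."""
    forecast = sigma2_one_step
    for _ in range(h - 1):
        forecast = alpha0 + persistence * forecast
    return forecast

# Start below and above the long-run variance to show reversion from both sides
for start in (0.4, 2.5):
    for h in (1, 5, 20, 250):
        print(f"start={start}, h={h}: {forecast_closed_form(start, h):.4f}")
print(f"long-run variance: {long_run_var:.4f}")
```

Whether volatility starts unusually calm or unusually turbulent, the long-horizon forecast ends up at the same level, which is what mean reversion means in practice.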

Part 4: Python Implementation - A Full GARCH Workflow

GARCH(1,1) Modeling in Python

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
from arch import arch_model

# --- 1. Get and Prepare the Data ---
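# Download S&P 500 data and calculate daily log returns (in percent).
# auto_adjust=True puts adjusted prices in the 'Close' column; recent yfinance
# releases no longer return a separate 'Adj Close' column by default.
sp500 = yf.download('^GSPC', start='2010-01-01', end='2023-12-31', auto_adjust=True)
# squeeze() yields a Series even if yfinance returns MultiIndex columns
close = sp500['Close'].squeeze()
returns = 100 * np.log(close).diff().dropna()
returns.name = 'SP500_Returns'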

# Plot the returns - notice the volatility clustering
returns.plot(title='S&P 500 Daily Returns (%)', figsize=(12,6))
plt.show()

# --- 2. Specify and Fit a GARCH(1,1) Model ---
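# We assume a simple constant mean model for the returns (arch_model's default).
# Note: arch_model's naming is the reverse of the GARCH(p,q) notation in Part 2:
# here p=1 is the ARCH order (lagged squared shocks) and q=1 is the GARCH order (lagged variances).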
garch_spec = arch_model(returns, vol='Garch', p=1, q=1)
garch_fit = garch_spec.fit(update_freq=5)

# Print the model summary
print(garch_fit.summary())

# --- 3. Analyze the Summary ---
# - The constant alpha_0 is reported as 'omega' in the arch summary table.
# - Check p-values for alpha[1] and beta[1]. They should be highly significant.
# - Note the value of beta[1] - it will likely be > 0.85, showing high persistence.
# - Check the sum of alpha[1] + beta[1]. It should be < 1.

# --- 4. Plot the Conditional Volatility ---
# The model output contains the fitted conditional volatility.
# Note: conditional_volatility is already a standard deviation (the square root
# of the conditional variance), in the same % units as the returns.
cond_vol = garch_fit.conditional_volatility

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(returns.index, returns, label='Daily Returns', alpha=0.7)
ax2 = ax.twinx()
ax2.plot(cond_vol.index, cond_vol, label='Conditional Volatility', color='red', linestyle='--')
ax.set_ylabel('Returns (%)')
ax2.set_ylabel('Volatility (Std. Dev.)')
ax.set_title('S&P 500 Returns and Fitted GARCH(1,1) Volatility')
fig.legend()
plt.show()
# The red line clearly shows the model capturing the high-volatility periods (e.g., 2020) and low-volatility periods.

# --- 5. Forecast Future Volatility ---
# Forecast the daily variance for each of the next 10 trading days
forecast = garch_fit.forecast(horizon=10)

# forecast.variance holds variances, one column per horizon; the last row is the
# forecast made from the final observation. Take the square root for volatility.
forecasted_vol = np.sqrt(forecast.variance.iloc[-1])
print("\nDaily volatility forecasts for the next 10 days (annualized, %):")
print(forecasted_vol * np.sqrt(252)) # Annualize by sqrt(252 trading days)
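The forecast step above produces one variance per future day. If what you need is the volatility of the total return over the whole 10-day window, the daily variances are summed before taking the square root. The sketch below illustrates the difference with hypothetical numbers standing in for `forecast.variance.iloc[-1]`, and it assumes daily returns are serially uncorrelated so that their variances add.

```python
import numpy as np

# Hypothetical per-day variance forecasts in percent^2 (not real model output)
daily_var = np.array([1.20, 1.18, 1.16, 1.14, 1.13, 1.11, 1.10, 1.09, 1.08, 1.07])

daily_vol = np.sqrt(daily_var)              # volatility of each single day (%)
annualized_vol = daily_vol * np.sqrt(252)   # each day's volatility on an annual scale
# Assumption: uncorrelated daily returns, so the 10-day variance is the sum
ten_day_vol = np.sqrt(daily_var.sum())      # volatility of the 10-day return (%)

print("annualized daily volatilities (%):", annualized_vol.round(2))
print("10-day horizon volatility (%):", round(float(ten_day_vol), 2))
```

The annualized figures are convenient for comparing against quoted volatilities, while the summed figure is the one that matters for a 10-day risk measure.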

Part 5: GARCH Zoo and Beyond

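The basic GARCH(1,1) model is powerful, but it's not perfect. Its main limitation is the same as ARCH's: it's symmetric. Because shocks enter only through their squares, a negative and a positive shock of the same size have the same effect on variance, so the model cannot account for the **leverage effect** (that negative news increases volatility more than positive news of the same magnitude).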

This limitation has led to a "GARCH zoo" of extensions designed to capture more complex dynamics:

EGARCH (Exponential GARCH): Models the log of the variance and explicitly includes a term to capture the leverage effect.
GJR-GARCH (Glosten-Jagannathan-Runkle GARCH): Adds a term that is "switched on" only for negative shocks, directly modeling the asymmetric response.
GARCH-in-Mean (GARCH-M): Allows the conditional variance to be a predictor in the mean equation, testing the finance theory that higher risk should be compensated with higher expected return.
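To make the asymmetry concrete, here is a minimal sketch of the GJR-GARCH(1,1) variance update, $\sigma_t^2 = \alpha_0 + (\alpha_1 + \gamma I_{t-1}) \epsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2$, where $I_{t-1} = 1$ only when $\epsilon_{t-1} < 0$. The function name and all parameter values are illustrative assumptions.

```python
def gjr_garch11_next_variance(alpha0, alpha1, gamma, beta1, last_eps, last_sigma2):
    """One-step GJR-GARCH(1,1) update; gamma applies only when last_eps < 0."""
    indicator = 1.0 if last_eps < 0 else 0.0
    return alpha0 + (alpha1 + gamma * indicator) * last_eps ** 2 + beta1 * last_sigma2

# Illustrative parameters (assumptions, not estimates)
params = dict(alpha0=0.05, alpha1=0.03, gamma=0.12, beta1=0.85)

after_good_news = gjr_garch11_next_variance(**params, last_eps=2.0, last_sigma2=1.0)
after_bad_news = gjr_garch11_next_variance(**params, last_eps=-2.0, last_sigma2=1.0)

print(f"variance after a +2% shock: {after_good_news:.2f}")
print(f"variance after a -2% shock: {after_bad_news:.2f}")
```

With $\gamma = 0$ the update collapses back to the symmetric GARCH(1,1) equation, so the extra parameter isolates the leverage effect.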

What's Next? From One Series to Many

You have now worked through the most widely used univariate volatility model in finance. With the ARIMA-GARCH framework, you can model and forecast both the conditional mean and the conditional variance of a single time series.

But markets are interconnected systems. The volatility of the S&P 500 is not independent of the volatility of the bond market. How do we model these interdependencies?

As we transition into our next module, **Advanced Quant Modeling**, we will expand our toolkit to handle multiple time series at once, starting with the **Vector Autoregression (VAR)** model.

Up Next: Let's Start Module 6: Vector Autoregression (VAR)