All Topics
Browse every topic available on QuantPrep. Use the search to find exactly what you're looking for.
Interactive tools for hands-on probability and statistics analysis.
Vectors, matrices, and eigenvalues. The language of data.
The science of collecting, analyzing, and interpreting data.
Master random variables, distributions, and stochastic processes.
Derive Black-Scholes from scratch and master the models that power modern finance.
Building predictive models for financial markets.
Vectors as geometric arrows vs. vectors as ordered lists of numbers (the data science view).
Addition, subtraction (tip-to-tail rule), and scalar multiplication (stretching/shrinking).
The dot product as a measure of 'projection' or 'agreement.' L1 and L2 norms as measures of length/magnitude. Cosine similarity as a practical application.
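A minimal NumPy sketch of these three ideas on two hypothetical vectors (the values are purely illustrative):

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([4.0, 3.0])

dot = u @ v                      # dot product: the "agreement" between u and v
l1  = np.sum(np.abs(u))          # L1 norm: sum of absolute components (7.0)
l2  = np.linalg.norm(u)          # L2 norm: Euclidean length (5.0)

# Cosine similarity: the dot product normalized by the two L2 lengths
cos_sim = dot / (np.linalg.norm(u) * np.linalg.norm(v))
print(dot, l1, l2, round(cos_sim, 3))   # 24.0 7.0 5.0 0.96
```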
The concept of perpendicular vectors (dot product = 0) and its meaning: independence.
A matrix as a container for data (a collection of vectors) vs. a matrix as a linear transformation that moves, rotates, and scales space.
Addition, scalar multiplication, and the transpose.
Taught not just as a rule, but as the composition of linear transformations. This explains why AB ≠ BA.
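One way to see the non-commutativity, sketched in NumPy with a hypothetical rotation and a hypothetical scaling:

```python
import numpy as np

A = np.array([[0.0, -1.0],       # rotate 90 degrees counter-clockwise
              [1.0,  0.0]])
B = np.array([[2.0, 0.0],        # stretch the x-axis by a factor of 2
              [0.0, 1.0]])

x = np.array([1.0, 0.0])
print(A @ (B @ x))   # scale first, then rotate -> [0., 2.]
print(B @ (A @ x))   # rotate first, then scale -> [0., 1.]
```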
Identity matrix (the 'do nothing' operation), inverse matrix (the 'undo' operation), diagonal, triangular, and symmetric matrices.
What can you build with a set of vectors?
Identifying and removing redundant vectors.
The minimal set of vectors needed to define a space and the concept of its dimension.
Formalizing these concepts. A subspace as a 'plane' or 'line' within a higher-dimensional space that passes through the origin.
Understanding Ax=b from the row picture (intersection of planes) and the column picture (linear combination of columns).
The core algorithm for solving linear systems. Row operations, row echelon form (REF).
Identifying if a system has a unique solution, no solution, or infinitely many solutions from its REF.
The ultimate, unique 'answer sheet' for a linear system, removing the need for back-substitution.
The 'matrix version' of Gaussian Elimination. Solving Ax=b becomes a fast, two-step process of forward and back substitution.
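A short sketch of that workflow using SciPy (the matrix and right-hand side are made up for illustration):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])
b = np.array([10.0, 12.0])

# Factor once (the expensive step); each new right-hand side then costs
# only a cheap forward substitution followed by a back substitution.
lu, piv = lu_factor(A)
x = lu_solve((lu, piv), b)
print(x, np.allclose(A @ x, b))
```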
The space of all possible outputs of A. The concept of rank as the "true dimension" of the output space.
The space of all inputs that map to the zero vector. Its connection to multicollinearity in data.
Completing the picture of the four fundamental subspaces.
How the four subspaces relate to each other and partition the input and output spaces.
The determinant as the scaling factor of area/volume.
Cofactor expansion and the properties of determinants. A determinant of zero means the matrix squishes space into a lower dimension (i.e., it's not invertible).
Finding the 'special' vectors that are only scaled by a transformation, not rotated off their span (Ax = λx).
The calculation behind eigenvalues: solving det(A - λI) = 0.
Decomposing a matrix into its core components: 'changing to the eigenbasis, scaling, and changing back.'
Using eigenvalues for tasks like calculating matrix powers (e.g., for Markov chains).
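A small NumPy sketch of the matrix-power idea, using a made-up two-state transition matrix:

```python
import numpy as np

# Hypothetical two-state Markov chain (rows sum to 1)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Diagonalize P = V diag(lam) V^-1, so that P^n = V diag(lam^n) V^-1
lam, V = np.linalg.eig(P)
P10 = V @ np.diag(lam**10) @ np.linalg.inv(V)

print(np.allclose(P10, np.linalg.matrix_power(P, 10)))   # True
```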
For symmetric matrices (like covariance matrices), the eigendecomposition is especially beautiful and stable (A = QDQᵀ). This is the theoretical foundation of PCA.
A highly efficient specialization for symmetric, positive-definite matrices, often used in optimization and financial modeling.
Introducing the goal of minimizing the error ||Ax - b||.
Finding the closest point in a subspace (the Column Space) to an external vector.
Deriving AᵀAx̂ = Aᵀb from the projection geometry. This is the engine of Linear Regression.
Understanding why AᵀA can be ill-conditioned and lead to numerical errors.
An algorithm for creating a "nice" orthonormal basis from any starting basis.
Using Gram-Schmidt to factor A = QR. Showing how this makes solving the least squares problem trivial (solve the triangular system Rx̂ = Qᵀb) and numerically robust.
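A compact sketch of both routes on simulated data (the design matrix and true coefficients are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 3))                                  # design matrix
b = A @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Normal equations: solve A^T A x = A^T b (simple, but can be ill-conditioned)
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# QR route: A = QR, then solve the small triangular system R x = Q^T b
Q, R = np.linalg.qr(A)
x_qr = np.linalg.solve(R, Q.T @ b)

print(np.allclose(x_ne, x_qr))   # both recover (roughly) [1, -2, 0.5]
```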
The ultimate decomposition (A = UΣVᵀ) that works for any matrix and finds orthonormal bases for all four fundamental subspaces simultaneously.
A direct, powerful application of SVD on the data matrix for dimensionality reduction.
Low-rank approximation for noise reduction, and the core ideas behind recommendation systems.
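A minimal NumPy sketch of truncated SVD as low-rank approximation (random data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))

# Thin SVD: X = U diag(s) Vt, with singular values sorted largest first
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep only the k largest singular values: the best rank-k approximation
# of X in the least-squares (Frobenius) sense
k = 3
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.linalg.norm(X - X_k))   # the reconstruction error that remains
```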
Deriving portfolio variance from first principles using linear algebra.
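The quadratic form behind that derivation, sketched with hypothetical weights and a hypothetical covariance matrix:

```python
import numpy as np

w = np.array([0.5, 0.3, 0.2])                 # portfolio weights
Sigma = np.array([[0.040, 0.006, 0.012],      # covariance of asset returns
                  [0.006, 0.090, 0.010],
                  [0.012, 0.010, 0.160]])

port_var = w @ Sigma @ w                      # portfolio variance = w' Sigma w
port_vol = np.sqrt(port_var)                  # portfolio volatility
print(port_var, port_vol)
```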
Using linear algebra to construct optimal portfolios.
Understanding the relationship between risk and expected return.
Connecting "no free lunch" to the geometry of vector spaces.
Modeling dynamic systems like credit ratings with transition matrices.
Duration and convexity as linear algebraic concepts.
A comprehensive guide to choosing the right statistical test.
A practical guide to deciding if your results are a real breakthrough or just random noise.
Compares the means of two groups, assuming normally distributed data.
Compares means of large samples (n>30) with known population variance.
Compares the averages of three or more groups.
Compares the variances (spread) of two or more groups.
Measures the linear relationship between two continuous variables.
Analyzes categorical data to find significant relationships.
Alternative to the T-Test when data is not normally distributed.
Alternative to ANOVA for comparing three or more groups.
Alternative to the paired T-Test for repeated measurements.
Measures the monotonic relationship between two ranked variables.
The non-parametric alternative to a repeated-measures ANOVA.
Tests if a sample is drawn from a specific distribution.
The detective work of data science.
Interactive guide to mean, median, skewness, and kurtosis.
Discover how order emerges from chaos.
Understanding the range where a true value likely lies.
Calculate probabilities from Z-scores and vice-versa.
The ubiquitous "bell curve."
Using random simulation to solve complex problems.
Breaking down a time series into its core components.
Measuring how a time series correlates with its past values.
Modeling the changing volatility of financial returns.
Finding the optimal portfolio for a given level of risk.
Dynamically estimating the state of a system from noisy data.
The calculus of random walks, essential for derivatives pricing.
Understanding the building blocks of probability.
The three fundamental rules that govern all of probability.
How the occurrence of one event affects another.
Updating your beliefs in the face of new evidence.
Describing the probabilities of discrete outcomes.
Calculating the center and spread of a random variable.
Exploring key models for random events.
The "fingerprint" of a distribution for deriving moments.
Describing the probabilities of continuous outcomes.
Calculating moments for continuous random variables.
Exploring Uniform, Exponential, and Gamma distributions.
Deriving moments for Normal, Exponential, and Gamma distributions.
Modeling multiple random variables simultaneously.
Extracting marginal and conditional probabilities from joint distributions.
Measuring how two random variables move together.
Defining when two variables have no influence on each other.
Mastering the bell curve and standardization.
Understanding how normal variables combine.
The cornerstone of modern portfolio theory.
Dissecting multi-asset models.
Applying MVN properties to portfolio construction and risk management.
The distribution of variances.
The backbone of hypothesis testing with small sample sizes.
Comparing variances between groups and the foundation of ANOVA.
Why casino averages are so stable.
Why the normal distribution is everywhere.
An interactive simulation to visualize the CLT with different distributions.
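A non-interactive sketch of the same experiment in NumPy (the distribution and sample sizes are chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample means of n draws from a skewed (exponential) distribution
n, reps = 50, 10_000
means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

# CLT: the sample mean is approximately N(mu, sigma^2 / n)
print(means.mean())   # close to the true mean, 1.0
print(means.std())    # close to 1.0 / sqrt(50), about 0.141
```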
Tools for approximating the distribution of functions of random variables.
Distinguishing between a function of data and a guess for a parameter.
Evaluating the accuracy of estimators.
Finding the "best" possible unbiased estimator.
Ensuring estimators converge to the true value and use all available information.
An intuitive technique for finding estimators by matching sample moments to population moments.
The most important method for parameter estimation in finance.
The practical side of implementing MLE.
A framework for creating intervals for any parameter.
Using t, χ², and Z pivotal quantities to build intervals.
The fundamental setup of all hypothesis tests.
The two equivalent approaches to making a statistical decision.
Finding the most powerful test for a given significance level.
A general method for comparing nested models.
Modeling a relationship with a single predictor.
The calculus behind finding the "best fit" line.
Assessing how well your linear model fits the data.
Extending SLR to multiple predictors using linear algebra.
The matrix algebra for solving a multiple regression problem.
Defining the rules for OLS to be BLUE.
The theoretical justification for using OLS.
Testing the significance of a single predictor.
Testing the significance of a group of predictors or the entire model.
Diagnosing when predictors are too correlated with each other.
Handling non-constant variance in the error terms.
Detecting patterns in the error terms over time.
A practical application of MLR to test a famous financial model.
Decomposing the components of a time series (Trend, Seasonality, Cycles, and Noise).
The most important property for modeling time series data.
The key tools for identifying the structure of a time series.
Modeling how past values influence the present.
Modeling how past forecast errors influence the present.
Combining AR and MA models to capture complex dynamics.
Incorporating differencing to model real-world data like stock prices.
The systematic process for identifying, estimating, and validating ARIMA models.
Introducing models where variance depends on past errors.
The industry-standard model for volatility forecasting.
A real-world project to model and forecast the volatility of a major stock index.
Exploring the theory that market prices are unpredictable.
The formal mathematical definition of a "fair game" and its implications for financial markets.
The standard model for stock price paths used in the Black-Scholes formula.
Using simulation to solve problems that are too hard for pure math.
A powerful computational method for assessing the uncertainty of an estimate when theory fails.
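A bare-bones bootstrap sketch (the return series and the statistic are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
returns = 0.01 * rng.standard_t(df=4, size=250)   # fake daily returns

# Resample the data with replacement and recompute the statistic each time
boot = np.empty(5_000)
for i in range(boot.size):
    sample = rng.choice(returns, size=returns.size, replace=True)
    boot[i] = sample.mean() / sample.std()        # daily Sharpe-style ratio

print(np.percentile(boot, [2.5, 97.5]))           # 95% percentile interval
```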
A related resampling technique for bias and variance estimation.
Modeling the dynamics of multiple time series at once.
A statistical test for finding stable, long-term relationships between non-stationary time series (the basis of pairs trading).
A model that combines long-run equilibrium (cointegration) with short-run dynamics (VAR).
A complete, real-world project to find a cointegrated pair of stocks and build a basic trading strategy.
A framework for updating beliefs with new evidence.
Modeling a single trial with two outcomes.
Modeling a series of success/fail trials.
Modeling the frequency of rare events.
Modeling trials until the first success.
Modeling sampling without replacement.
Modeling trials until a set number of successes.
Modeling where all outcomes are equally likely.
Generalizing the Binomial for multiple outcomes.
Modeling waiting times and skewed data.
Modeling probabilities, percentages, and proportions.
Modeling the time between events in a Poisson process.
Modeling extreme events and 'fat-tailed' phenomena.
Modeling with a sharp peak and 'fat tails'.
Modeling time-to-failure and event durations.
A key distribution in machine learning and growth modeling.
Understanding the building blocks of probability.
Techniques for counting outcomes and possibilities.
How the occurrence of one event affects another.
Updating beliefs in the face of new evidence.
Mapping outcomes of a random process to numbers.
Calculating the center, spread, and shape of a distribution.
Exploring Bernoulli, Binomial, and Poisson distributions.
Modeling the behavior of multiple random variables at once.
Measuring how two random variables move together.
Why casino averages are so stable.
Why the normal distribution is everywhere.
Finding the distribution of a function of a random variable.
Quantifying information with Entropy and KL Divergence.
Understanding random phenomena that evolve over time.
Modeling memoryless state transitions.
Modeling the timing of random events.
The mathematical foundation of stock price movements.
The rigorous foundation of modern probability.
A more powerful theory of integration.
The formal model of a fair game.
The calculus of random walks, essential for derivatives pricing.
The essential mindset and vocabulary for thinking like a data scientist.
Understanding the fundamental challenge of Underfitting vs. Overfitting.
The art of splitting data to build robust models that generalize.
The mandatory step of Standardization & Normalization before training most models.
An intuitive look at a simple, powerful classification algorithm.
Understanding the basic concept of fitting a line to data.
How to score a model when "accuracy" isn't enough.
How to score the performance of a continuous predictive model.
Mastering the most interpretable and fundamental predictive models.
The core algorithm that powers almost all modern machine learning.
How adding penalties to our loss function creates simpler, more robust models.
A geometric explanation for why the L1 penalty can force coefficients to exactly zero.
How to adapt linear models for binary outcomes using the Sigmoid function.
Advanced metrics for scoring classification models.
Understanding when linear models work and when they fail.
A practical project to predict loan defaults using Logistic Regression.
Moving beyond straight lines to capture complex patterns in the data.
The formal criteria a tree uses to find the "best" question to ask.
Understanding why decision trees are powerful but unstable.
Techniques to control a tree's complexity and prevent overfitting.
Using feature engineering to make linear models more flexible.
The geometric intuition behind "maximum margin" classifiers.
A powerful mathematical "hack" that allows SVMs to find non-linear decision boundaries.
A practical guide to model selection based on data characteristics.
Combining many simple models to create one powerful one. The king of tabular data.
Reducing variance by averaging the results of many trees trained on different data subsets.
Exploring feature randomness and its role in decorrelating trees.
Building a "chain" of models where each one corrects the errors of the last.
The mechanics of using gradient descent to build an ensemble.
A look at the regularization and speed improvements that made XGBoost dominant.
Understanding when to use Random Forest versus a boosted model.
A practical project to compare ensemble methods on a real financial task.
Discovering hidden patterns and simplifying data without labels.
An intuitive dive into the most popular clustering algorithm.
Understanding the weaknesses of K-Means and how to mitigate them.
Techniques for choosing the optimal number of clusters for your data.
A geometric understanding of the most important dimensionality reduction technique.
Connecting PCA directly to the linear algebra of eigendecomposition.
Practical uses of PCA for risk management and factor investing.
Exploring alternatives to K-Means for more complex data structures.
Specialized models for data where the order of observations matters.
A formal statistical test for the most important property of time series data.
A deep dive into the workhorses of classical time series forecasting.
Modeling the "volatility of volatility" in financial markets.
How to create powerful predictors from time-ordered data.
Adapting standard ML models like Gradient Boosting for sequence prediction.
The most common and dangerous mistakes made in time series validation.
A modern technique to achieve stationarity without destroying the data's memory.
Building neural networks from the ground up, one neuron at a time.
Stacking neurons into layers to learn complex, hierarchical patterns.
A conceptual walkthrough of the algorithm that drives all of deep learning.
The tools we use to guide the learning process effectively.
Essential techniques to ensure your neural network generalizes to new data.
The theory explaining why a neural network can, in principle, approximate any function.
A hands-on coding session to put the theory into practice.
A brief look at the specialized architecture for spatial data.
Applying neural networks to time series and text data.
Introducing loops into the network to create a form of memory.
Understanding why simple RNNs struggle to learn long-term dependencies.
The architectures that use "gates" to solve the long-term memory problem.
Allowing the model to "pay attention" to the most relevant parts of an input sequence.
The model behind modern marvels like ChatGPT and BERT.
A practical framework for choosing the right tool for your forecasting problem.
Applying a state-of-the-art deep learning model to a real financial task.
Extracting predictive signals from unstructured text data like news and reports.
Classifying financial text as positive, negative, or neutral.
Representing words as dense vectors in a meaningful space.
How transformers create word representations that understand context.
Extracting specific information like company names or identifying key themes in a document.
Automating the process of turning documents into structured data.
Practical techniques for using sentiment scores as predictive features.
An end-to-end project to build a sentiment-based trading signal.
Advanced, highly practical topics that define the cutting edge of quantitative ML.
A sophisticated technique to build a model that decides how much to bet, on top of a model that decides when to bet.
Why so many backtests look good on paper but fail in live trading.
Training an AI agent to learn an optimal trading strategy through trial and error.
Using machine learning to create more stable and predictive risk models.
Finding alpha in new, unstructured data sources.
The critical non-technical considerations for deploying ML models in a regulated industry.
A capstone project to synthesize all learned concepts into a professional-grade strategy proposal.
Mean (μ) as the "expected value" or "center" of a distribution. Variance (σ²) as the measure of "spread" or "messiness" of all possible outcomes.
Standard Deviation (σ = √σ²) as the "fix" for variance. Why we use σ instead of σ².
The "bell curve" as the mathematical "recipe" for pure randomness. How μ and σ² affect its shape.
A review of df/dt as the "instantaneous rate of change" and an introduction to partial derivatives.
Reviewing ∫f(x)dx as "summing up an infinite number of tiny pieces" to find a total area or total change.
Approximating any complex, curved function using a simple polynomial (line or parabola). This is our "prediction" tool.
Upgrading our approximation tool for functions of two variables, like V(t, St).
Why we can ignore second-order terms like (Δt)² in normal calculus. This is the key rule that is about to be broken.
Discover the "infinitely wiggly" path of a stock and why df/dt (a "slope") doesn't exist for a random path.
Defining our 'perfectly random' path and its 4 key properties: W₀ = 0, continuous paths, independent increments, and the key rule Wₜ − Wₛ ~ N(0, t−s).
Deriving the "typical size" of a single random step. Why does Wₜ ~ N(0, t) mean the Standard Deviation is √t?
Building the SDE: dSₜ = μSₜdt + σSₜdWₜ. The "Drift" (μ) is the river's current, and "Diffusion" (σ) is the "jiggliness" of the water.
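An Euler-Maruyama sketch of that SDE (all parameter values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.08, 0.20        # drift ("current") and diffusion ("jiggliness")
S0, T, n = 100.0, 1.0, 252
dt = T / n

# Discretize dS_t = mu*S_t*dt + sigma*S_t*dW_t one small step at a time
S = np.empty(n + 1)
S[0] = S0
dW = rng.normal(scale=np.sqrt(dt), size=n)
for i in range(n):
    S[i + 1] = S[i] + mu * S[i] * dt + sigma * S[i] * dW[i]

print(S[-1])                  # one simulated year-end price
```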
Proving why a Brownian Motion has no derivative by showing that the "path length," Σ|ΔW|, is proportional to Σ√Δt, which goes to infinity.
If Σ|ΔW| fails, what about Σ(ΔW)²? Showing that Σ(ΔW)² converges to Σ(√Δt)² = ΣΔt = T. The "squared" path is finite.
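A quick simulation of both sums (step counts chosen arbitrarily): the absolute path length keeps growing as the grid is refined, while the sum of squares settles near T.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1.0

for n in (100, 10_000, 1_000_000):
    dW = rng.normal(scale=np.sqrt(T / n), size=n)
    # Sum|dW| grows like sqrt(n); Sum (dW)^2 converges to T = 1
    print(n, np.sum(np.abs(dW)), np.sum(dW**2))
```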
Turning our discovery into rules for our infinitesimal steps: (dt)²=0, dt·dWt=0, and the "Weird Rule" (dWt)²=dt.
How do these new rules change integration? We'll solve ∫₀ᵗ WₛdWₛ and show that it equals ½Wₜ² − ½t. That extra −½t term is the "cost of randomness."
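A numerical check of that identity at a horizon T = 1 (grid size arbitrary); the Itô sum uses the left endpoint of each interval:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 1.0, 1_000_000
dW = rng.normal(scale=np.sqrt(T / n), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))   # Brownian path with W[0] = 0

ito_sum = np.sum(W[:-1] * dW)                # left-endpoint (Ito) sum
print(ito_sum)
print(0.5 * W[-1]**2 - 0.5 * T)              # agrees up to discretization error
```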
What's the chain rule for a function of just Wt? We'll plug ΔW into our Taylor formula and apply our "weird algebra" rules. Result: df = f'(Wₜ)dWₜ + ½f''(Wₜ)dt.
The "final boss" of our theory. We'll combine all our tools: the 2-variable Taylor expansion, the SDE (dSt = a dt + b dWt), and our "weird algebra" rules. We will go step-by-step, showing which of the 5 Taylor terms "survive" and which "die" (go to 0). Result: df = (∂f/∂t + a∂f/∂S + ½b²∂²f/∂S²)dt + (b∂f/∂S)dWt
A physical meaning for all 4 terms in the full Itô's Lemma. (Time Decay, Drift Effect, Itô Correction, and the new Random Part).
Constructing the delta-hedged portfolio (Π = -V + ΔS) and finding its SDE, dΠ.
How to choose a "magic" value for Δ that makes the entire dWₜ ("Random") term vanish. We'll solve for Δ and find that Δ = ∂V/∂S.
Plugging our "magic" Δ back into the drift part of our portfolio to watch the subjective μ term vanish completely.
Our portfolio is now risk-free, so it must earn the risk-free rate, r. We'll set our two equations for dΠ equal to each other and rearrange to get the Black-Scholes-Merton PDE.
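The equation that rearrangement produces is the standard Black-Scholes-Merton PDE:

```latex
\frac{\partial V}{\partial t}
+ \tfrac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}
+ r S \frac{\partial V}{\partial S}
- rV = 0
```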
Explaining the physical meaning of the famous formula: (Expected Benefit) - (Expected Cost).
Δ = ∂V/∂S. How much V moves when S moves $1. How to use Delta to hedge.
Γ = ∂²V/∂S². How much Δ moves when S moves $1. Why Gamma is the "risk of your hedge."
ν = ∂V/∂σ. How much V moves when volatility σ changes by 1%. Why "panic is good" for an option holder.
Θ = ∂V/∂t. The rate of time decay. The cost of waiting.
ρ = ∂V/∂r. The sensitivity to changes in the risk-free rate.
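A minimal SciPy sketch of the five Greeks above for a European call, using the closed-form Black-Scholes expressions (parameter values are illustrative):

```python
import numpy as np
from scipy.stats import norm

def bs_call_greeks(S, K, T, r, sigma):
    """Delta, Gamma, Vega, Theta, Rho of a European call under Black-Scholes."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    delta = norm.cdf(d1)                                   # dV/dS
    gamma = norm.pdf(d1) / (S * sigma * np.sqrt(T))        # d2V/dS2
    vega  = S * norm.pdf(d1) * np.sqrt(T)                  # dV/dsigma (per 1.00 vol)
    theta = (-S * norm.pdf(d1) * sigma / (2 * np.sqrt(T))  # dV/dt (per year)
             - r * K * np.exp(-r * T) * norm.cdf(d2))
    rho   = K * T * np.exp(-r * T) * norm.cdf(d2)          # dV/dr
    return delta, gamma, vega, theta, rho

print(bs_call_greeks(S=100, K=100, T=1.0, r=0.05, sigma=0.2))
```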
Introducing the risk-neutral measure Q and the fundamental theorem of asset pricing.
Using simulation to price complex derivatives that have no closed-form solution.
Modeling volatility itself as a random process to capture market dynamics like the "volatility smile."
Adding "jumps" to our random walk to account for sudden market crashes and shocks.
Modeling the risk-free rate itself as a random process for pricing long-term bonds and derivatives.
Decomposing time series into trend, seasonality, and residuals.
The bedrock assumption for most time series models.
The key diagnostic tools for identifying model structure.
Modeling how past values influence the present.
Modeling how past forecast errors influence the present.
Combining AR and MA models to capture complex dynamics.
Using differencing to model non-stationary series like stock prices.
A systematic process for ARIMA model identification and estimation.
Understanding volatility clustering and why it matters.
Modeling variance as a function of past shocks.
The industry-standard model for forecasting financial volatility.
Capturing asymmetric effects like the leverage effect.
An end-to-end project to fit and forecast market volatility.
Modeling the dynamic interplay between multiple time series simultaneously.
Finding stable, long-run equilibrium relationships between non-stationary series.
A powerful framework for modeling systems with unobserved states.
Imposing economic theory on VAR models to identify causal shocks.
Building an end-to-end pairs trading strategy based on cointegration.
Creating lagged, rolling, and date-based features for ML models.
The importance of walk-forward validation and avoiding look-ahead bias.
Applying powerful ensemble methods to time series data.
An introduction to using recurrent neural networks for sequence data.
Exploring modern techniques from quantitative finance for better models.