Why I’m Interested in Causal ML
Overview
My interest in Causal Machine Learning (Causal ML) comes from a simple tension: many models predict well, but prediction alone does not answer the questions I care about: What happens if we intervene? And how sure are we?
In Statistical Modeling: The Two Cultures (2001), Leo Breiman described two traditions that often talk past each other:
| Culture | Primary goal | Typical focus |
|---|---|---|
| Data modeling (statistics/econometrics) | explanation + inference | assumptions, parameters, uncertainty |
| Algorithmic modeling (machine learning) | prediction + generalization | performance on unseen data |
Causal ML is compelling to me because it aims to combine the strengths of both: flexible prediction models with principled causal questions and uncertainty quantification.
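This tension can be made concrete with a toy simulation (a hedged sketch of my own, not an example from any particular method): when a confounder U drives both the treatment T and the outcome Y, a purely predictive regression of Y on T fits the data well yet misreports the interventional effect, while adjusting for U recovers it.

```python
import numpy as np

# Toy data-generating process with a confounder U:
#   T = U + noise,  Y = 1.0 * T + 2.0 * U + noise
# so the true causal (interventional) effect of T on Y is 1.0.
rng = np.random.default_rng(0)
n = 100_000
U = rng.normal(size=n)
T = U + rng.normal(size=n)
Y = 1.0 * T + 2.0 * U + rng.normal(size=n)

# Purely predictive regression of Y on T: the coefficient mixes
# the causal effect with the confounding path through U.
X_naive = np.column_stack([np.ones(n), T])
naive_coef = np.linalg.lstsq(X_naive, Y, rcond=None)[0][1]

# Adjusting for the confounder recovers the causal effect.
X_adj = np.column_stack([np.ones(n), T, U])
adj_coef = np.linalg.lstsq(X_adj, Y, rcond=None)[0][1]

print(f"naive coefficient on T:    {naive_coef:.2f}")  # roughly 2.0, far from the true 1.0
print(f"adjusted coefficient on T: {adj_coef:.2f}")    # close to the true 1.0
```

Both regressions predict Y about equally well in-sample; only the adjusted one answers the interventional question.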
Focus of This Book
This online book is my mathematical learning journey toward understanding Causal ML from first principles. My goal is to rebuild the foundations carefully and honestly - writing only what I have actually studied - so that later I can read and use causal methods with clarity and rigor.
The Foundational Path
The phases below trace that journey.
Each step builds exactly the structure required for the next - nothing ornamental, nothing skipped.
| Phase | Topic | Focus |
|---|---|---|
| 1 | Logic & Set Theory | Proof techniques, quantifiers, mathematical language |
| 2 | Linear Algebra | Inner products, projections, spectral ideas; geometry of estimation |
| 3 | Real Analysis (ℝ) | Limits, continuity, differentiation, compactness - calculus made rigorous |
| 4 | Metric Spaces | Convergence, open/closed sets, completeness; minimal topology |
| 5 | Measure Theory | σ-algebras, integration, product measures |
| 6 | Probability (measure-theoretic) | Probability spaces, random variables, convergence modes, LLN/CLT, conditional expectation |
| 7 | Mathematical Statistics | Likelihood, asymptotics, efficiency, influence functions |
| 8 | Decision & Risk | Loss, optimality, statistical decision theory |
| 9 | Uncertainty Quantification | Parameter uncertainty, predictive distributions, bootstrap, conformal inference, robustness |
| 10 | Statistical Learning Theory | ERM, generalization, regularization, high-dimensional behavior |
| 11 | Causal Inference + Causal ML | Identifiability, orthogonality, double ML, valid uncertainty for treatment effects |
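To preview where the path leads, the core "orthogonality" move behind double ML can be sketched in a few lines. The snippet below is a simplified Robinson-style partialling-out in a partially linear model with deliberately linear nuisance functions (an illustrative assumption; double ML replaces the linear fits with flexible ML learners and cross-fitting): residualize the outcome and the treatment on the covariates, then regress residual on residual.

```python
import numpy as np

# Partially linear model: Y = theta * T + f(X) + noise, T = g(X) + noise.
# Here f and g are linear by construction, so a least-squares
# residualization suffices to illustrate the idea.
rng = np.random.default_rng(1)
n = 50_000
theta = 0.5                      # true treatment effect
X = rng.normal(size=n)
T = X + rng.normal(size=n)       # treatment depends on the covariate
Y = theta * T + 2.0 * X + rng.normal(size=n)

def residualize(v, x):
    """Residual of v after a linear fit on x (with intercept)."""
    A = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

T_res = residualize(T, X)
Y_res = residualize(Y, X)

# Residual-on-residual regression estimates theta.
theta_hat = (T_res @ Y_res) / (T_res @ T_res)
print(f"theta_hat = {theta_hat:.2f}")  # close to the true 0.5
```

The point of the residualization is that small errors in estimating the nuisance functions perturb the estimate of theta only at second order - the "orthogonality" named in phase 11.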
Why Mathematical Rigor Matters to Me
In a short interview often titled “Why?”, Richard Feynman is asked a seemingly simple question:
“Why do magnets repel each other?”
Instead of giving a comforting mechanical story, he does something deeper.
He explains that every answer depends on what you are willing to accept as fundamental.
If he says:
“Because of electromagnetic forces,”
the interviewer can still ask:
“Why electromagnetic forces?”
If he explains using quantum electrodynamics:
“Because of exchange particles,”
one can still ask:
“Why that law?”
Eventually, explanation stops being reduction and becomes:
“This is how nature behaves. This is the framework.”
Feynman’s key insight is subtle but profound:
There is no explanation without a conceptual framework.
Every “why” presupposes some deeper structure that we are not questioning.
This resonates deeply with me.
In data science and machine learning, it is possible to use powerful tools without fully understanding the structures that justify them. A model may perform well, an estimator may seem stable, an interval may look convincing.
But if I cannot explain why it works - in terms of convergence, projection, identifiability, or probability - then I am operating at a surface layer.
For me, mathematical rigor is not about abstraction for its own sake.
It is about knowing which layer I am standing on.