"A production model is only as reproducible as the data graph that produced it."
Overview
Data versioning and lineage turn training data from an invisible dependency into an auditable mathematical artifact.
Production ML and MLOps are the mathematical discipline of keeping a learned system useful after it leaves the notebook. The model is only one artifact in a larger graph of data, code, configuration, evaluation, deployment, monitoring, and response actions.
This chapter uses LaTeX Markdown throughout. Inline mathematics uses `$...$`, and display equations use `$$...$$`. The central habit is to turn production behavior into explicit objects: versions, hashes, traces, thresholds, queues, contracts, and release decisions.
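As a rough illustration of treating versions and hashes as explicit objects, the sketch below builds a tiny lineage graph in which each node's version id is a hash of its own content and the ids of its parents. The helper names `content_hash` and `make_node`, the node fields, and the toy data are assumptions made for this example; they are not part of the chapter's notebooks or of any particular tool.

```python
import hashlib
import json


def content_hash(obj) -> str:
    """Return the SHA-256 hex digest of a JSON-serializable object."""
    # Sorted keys and fixed separators make the serialization deterministic.
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_node(kind, payload, parents=()):
    """Build a lineage node (hypothetical schema) whose id covers its payload and parents."""
    parent_ids = sorted(p["id"] for p in parents)
    node_id = content_hash({"kind": kind, "payload": payload, "parents": parent_ids})
    return {"id": node_id, "kind": kind, "payload": payload, "parents": parent_ids}


# Synthetic artifacts: a dataset, a training config, and a model derived from both.
raw = make_node("dataset", {"rows": [[1.0, 0], [2.0, 1], [3.0, 1]]})
config = make_node("config", {"lr": 0.1, "seed": 0})
model = make_node("model", {"name": "toy-classifier"}, parents=[raw, config])

# Flip one label in the dataset: the dataset id changes, and so does the
# id of every artifact downstream of it, even though the config is unchanged.
raw_v2 = make_node("dataset", {"rows": [[1.0, 0], [2.0, 1], [3.0, 0]]})
model_v2 = make_node("model", {"name": "toy-classifier"}, parents=[raw_v2, config])

print("dataset v1:", raw["id"][:12])
print("dataset v2:", raw_v2["id"][:12])
print("model id changed:", model["id"] != model_v2["id"])
```

Because a node's id depends on its parents' ids, any upstream change propagates to every downstream version. That is the property that makes a lineage graph auditable under this scheme: two models with the same id were, by construction, built from the same recorded data and configuration.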
Prerequisites
- Documentation and Governance
- Contamination and Dedup Audits
- Error Analysis and Ablations
- Human in the Loop and Monitoring
Companion Notebooks
| Notebook | Description |
|---|---|
| theory.ipynb | Executable demonstrations for data versioning and lineage |
| exercises.ipynb | Graded practice for data versioning and lineage |
Learning Objectives
After completing this section, you will be able to:
- Define production ML artifacts using mathematical notation
- Represent dependencies as auditable graphs and contracts
- Compute simple production statistics with synthetic data
- Separate offline evaluation from online monitoring
- Design release gates that combine quality, safety, latency, and cost
- Explain how versioning enables rollback and reproducibility
- Diagnose drift, skew, and production regressions
- Connect LLM traces to evaluations, guardrails, and retraining data
- Identify operational failure modes before they become incidents
- Build lightweight notebook simulations of production ML behavior
Study Flow
- Read the pages in order and pause after each page to restate the main definition or theorem.
- Run `theory.ipynb` when you want to check the formulas numerically.
- Use `exercises.ipynb` after the reading path, not before it.
- Return to this overview page when you need the chapter-level navigation.