Overview
LLM Engineering From Scratch is a public learning series that rebuilds core LLM mechanics one project at a time. Each project pairs a small runnable implementation with plots, stress cases, explanations, and a reproducible artifact.
Roadmap inspiration: Ahmad Osman (@TheAhmadOsman) and his article, “Step-By-Step LLM Engineering Projects (2026 Edition)”. The repository uses the article as a roadmap reference; the implementations, experiments, traces, demos, and writeups are independent.
Status
The first project, Tokenizer From Scratch, is implemented: a byte-level BPE (byte-pair encoding) tokenizer with deterministic artifacts and an interactive static demo. The planned sequence continues through embeddings, positional methods, attention, Transformer blocks, training loops, and objectives.
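For readers new to the term, byte-level BPE starts from the 256 possible byte values and repeatedly merges the most frequent adjacent pair into a new token ID. The following is a minimal sketch of that idea, not the repository's actual code: the function names (`train_bpe`, `merge`, `encode`, `decode`), the tie-break rule, and the stopping condition are all assumptions made for illustration.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges over the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))  # base vocabulary: byte values 0..255
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        # Assumed tie-break: highest count first, then the smallest pair,
        # so the same input always yields the same merge table.
        best = min(pairs, key=lambda p: (-pairs[p], p))
        if pairs[best] < 2:
            break  # assumed stopping rule: no pair repeats, nothing to compress
        ids = merge(ids, best, 256 + len(merges))
        merges.append(best)
    return merges


def encode(text, merges):
    """Apply the learned merges, in training order, to new text."""
    ids = list(text.encode("utf-8"))
    for rank, pair in enumerate(merges):
        ids = merge(ids, pair, 256 + rank)
    return ids


def decode(ids, merges):
    """Expand token IDs back into bytes, then into text."""
    vocab = {i: bytes([i]) for i in range(256)}
    for rank, (a, b) in enumerate(merges):
        vocab[256 + rank] = vocab[a] + vocab[b]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


merges = train_bpe("aaabdaaabac", num_merges=10)
tokens = encode("aaabdaaabac", merges)
```

In this sketch the 11 input bytes compress to 5 tokens after three merges, and `decode(tokens, merges)` returns the original string. Because the base vocabulary is raw bytes, any UTF-8 text can be encoded without an unknown-token fallback.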
Series Pattern
Each project ships five kinds of evidence:
- Implementation - readable Python from scratch.
- Notebook - a runnable experiment and explanation path.
- Plots - charts that show behavior instead of only claiming it.
- Failure gallery - examples where the implementation gets stressed.
- Article/demo - a technical post with an interactive or visual artifact.
Roadmap
| # | Project | Hard concept | Status |
|---|---|---|---|
| 1 | Tokenizer from scratch | Tokenization is a learned compression tradeoff. | Implemented |
| 2 | One-hot vectors and learned embeddings | IDs gain meaning through learned vector geometry. | Planned |
| 3 | Positional methods | Attention is order-agnostic on its own, so position must be encoded. | Planned |
| 4 | Scaled dot-product attention | Attention is weighted retrieval from context. | Planned |
| 5 | Multi-head attention | Heads can learn different relational patterns. | Planned |
| 6 | One decoder block | LLM behavior emerges from interacting parts. | Planned |
| 7 | Mini-former | The training loop is the lesson. | Planned |
| 8 | Language-model objectives | Objective choice shapes capabilities and failures. | Planned |
Why It Matters
Frameworks are useful, but they can hide the mechanisms that make LLM systems work or fail. This series makes those mechanisms visible through code, plots, traces, failure cases, and short explanations that compound into a deeper engineering foundation.
