Overview

LLM Engineering From Scratch is a public learning series that rebuilds core LLM mechanics one project at a time. Each project pairs a small runnable implementation with plots, stress cases, explanations, and a reproducible artifact.

Roadmap inspiration: Ahmad Osman (@TheAhmadOsman) and his article, “Step-By-Step LLM Engineering Projects (2026 Edition)”. The repository uses the article as a roadmap reference; the implementations, experiments, traces, demos, and writeups are independent.

Status

The first project, Tokenizer From Scratch, is implemented: a byte-level BPE tokenizer with deterministic artifacts and an interactive static demo. The planned sequence continues through embeddings, positional methods, attention, Transformer blocks, training loops, and objectives.
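
For readers new to the term, byte-level BPE can be sketched in a few lines. The code below is a minimal illustration, not the repository's implementation: the function names (`train_bpe`, `merge`, `encode`, `decode`) and the tie-breaking rule are assumptions made for this example. It starts from raw UTF-8 bytes (IDs 0 to 255) and repeatedly merges the most frequent adjacent pair into a new ID.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges. Hypothetical helper, for illustration only."""
    ids = list(text.encode("utf-8"))  # byte-level: the base vocabulary is 0..255
    merges = {}  # (left_id, right_id) -> new_id, kept in learning order
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        # Assumed tie-break: highest count first, then smallest pair,
        # so that training is deterministic.
        best = min(pairs, key=lambda p: (-pairs[p], p))
        if pairs[best] < 2:
            break  # nothing left worth merging
        new_id = 256 + len(merges)
        merges[best] = new_id
        ids = merge(ids, best, new_id)
    return merges


def encode(text, merges):
    """Apply the learned merges in the order they were learned."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Expand each ID back to its byte string, then decode as UTF-8."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")
```

Because the base vocabulary covers every byte, any valid UTF-8 string round-trips through `encode` and `decode`, including text never seen during training. Each merge shortens the sequence at the cost of a larger vocabulary, which is the compression tradeoff the first project explores.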

Series Pattern

Each project ships five kinds of evidence:

  1. Implementation - readable Python from scratch.
  2. Notebook - a runnable experiment and explanation path.
  3. Plots - charts that show behavior instead of only claiming it.
  4. Failure gallery - examples where the implementation gets stressed.
  5. Article/demo - a technical post with an interactive or visual artifact.

Roadmap

| # | Project                                | Hard concept                                        | Status      |
|---|----------------------------------------|-----------------------------------------------------|-------------|
| 1 | Tokenizer from scratch                 | Tokenization is a learned compression tradeoff.     | Implemented |
| 2 | One-hot vectors and learned embeddings | IDs gain meaning through learned vector geometry.   | Planned     |
| 3 | Positional methods                     | Attention needs order.                              | Planned     |
| 4 | Scaled dot-product attention           | Attention is weighted retrieval from context.       | Planned     |
| 5 | Multi-head attention                   | Heads can learn different relational patterns.      | Planned     |
| 6 | One decoder block                      | LLM behavior emerges from interacting parts.        | Planned     |
| 7 | Mini-former                            | The training loop is the lesson.                    | Planned     |
| 8 | Language-model objectives              | Objective choice shapes capabilities and failures.  | Planned     |
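
As a preview of the "weighted retrieval" idea behind project 4, here is a minimal sketch of scaled dot-product attention in plain Python. That project is still planned, so this is not the repository's code: the function names and the list-of-lists representation are assumptions chosen for readability, and the sketch omits masking and batching.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of query, key, and value vectors.

    Illustrative only: no causal mask, no batching, no learned projections.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights
```

Each query scores every key, the softmax turns those scores into weights that sum to 1, and the output is the corresponding weighted average of the values. A query that aligns strongly with one key therefore retrieves mostly that key's value.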

Why It Matters

Frameworks are useful, but they can hide the mechanisms that make LLM systems work or fail. This series makes those mechanisms visible through code, plots, traces, failure cases, and short explanations that compound into a deeper engineering foundation.