← Back to QuantegyAI

Inside Large Language Models

Eight interactive modules · about 3–4 hours · Course 4 (Deep Learning) is the recommended prerequisite. No coding required.


You have used ChatGPT and Claude. This course opens them up. A large language model is, at its heart, doing one thing over and over: predicting the next token. Everything else — embeddings, attention, the transformer — is machinery built to make that one prediction astonishingly good. Here you will build that machinery from the bottom up, and every piece runs live in your browser.

This is genuinely hands-on. You will train a real bigram model on a tiny corpus, watch it babble, then sharpen it; place tokens in an embedding space and find their nearest neighbours; turn attention weights up and down and see which earlier words a prediction leans on; work through the query–key–value math of self-attention by dragging sliders; train a tiny neural language model live and watch its loss curve fall; and turn the temperature dial to feel the difference between dull and unhinged text. Each module also shows the matching Hugging Face / PyTorch idea in read-only form, so you can recognize it later, with nothing to install now. Each module ends with a short mastery check; pass it to mark the module complete.

The core idea

Module 1

Predicting the Next Token

The whole game in one move: given the words so far, what comes next? Activity: build a live bigram model from a small corpus, see the probability bars, and sample sentences from it. AI anchor: this is exactly the conditional probability from Course 1, scaled up.
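
For readers who want to see the idea in code, here is a minimal sketch of a word-level bigram model in plain Python. The corpus and the function names (`train_bigram`, `next_word_probs`, `sample_sentence`) are made up for illustration and are not the course's own implementation.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word follows each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Turn raw follow-counts into the conditional probability P(next | prev)."""
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}

def sample_sentence(counts, start, max_words=8, seed=0):
    """Repeatedly sample the next word until we hit a dead end or max_words."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < max_words and counts.get(out[-1]):
        probs = next_word_probs(counts, out[-1])
        words, weights = zip(*probs.items())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)

# Illustrative toy corpus (an assumption, not the course corpus).
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram(corpus)
probs = next_word_probs(model, "the")
print(probs)  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
print(sample_sentence(model, "the"))
```

The probability bars in the activity correspond to the dictionary returned by `next_word_probs`: one bar per candidate next word.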

Module 2

Tokens & Embeddings

Models do not see words — they see numbers. Activity: turn text into tokens, place each token as a vector on a 2D map, and find a token’s nearest neighbours by meaning. AI anchor: every prompt becomes a sequence of embeddings first.
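
As a rough sketch of the two steps above, the snippet below maps words to integer ids and then ranks neighbours by cosine similarity. The 2D vectors are hand-picked for illustration (real embeddings are learned and have hundreds or thousands of dimensions), and the whitespace tokenizer is a simplification of the subword tokenizers real models use.

```python
import math

# Hypothetical 2D embeddings, chosen by hand for illustration (not learned).
embeddings = {
    "cat": (0.9, 0.8),
    "dog": (0.8, 0.9),
    "kitten": (0.95, 0.7),
    "car": (-0.7, 0.6),
    "truck": (-0.8, 0.5),
}

# Each token gets an integer id; the model only ever sees these numbers.
vocab = {word: i for i, word in enumerate(embeddings)}

def tokenize(text):
    """Simplified whitespace tokenizer: known words become integer ids."""
    return [vocab[w] for w in text.lower().split() if w in vocab]

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, -1.0 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest(word, k=2):
    """Return the k tokens whose vectors point most nearly the same way."""
    others = [w for w in embeddings if w != word]
    others.sort(key=lambda w: cosine(embeddings[word], embeddings[w]), reverse=True)
    return others[:k]

print(tokenize("cat dog car"))  # [0, 1, 3]
print(nearest("cat"))           # ['kitten', 'dog']
```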

How models read context

Module 3

Attention, Intuitively

To predict the next word, which earlier words matter? Activity: move an attention slider across a sentence and watch the model lean on some words and ignore others. AI anchor: "Attention Is All You Need", the 2017 paper that introduced the transformer and the idea behind modern LLMs.

Module 4

How Self-Attention Works

Open the box: queries, keys, and values. Activity: set a query and watch the dot-product scores become softmax weights that blend the values into one output vector. AI anchor: the actual computation inside every transformer layer.
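
The computation described above can be sketched in a few lines of plain Python for a single query. The vectors are toy values chosen for illustration, and the division by the square root of the dimension is the usual scaled dot-product convention, added here as an assumption since the activity text does not mention it.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """One query attends over all keys and blends the matching values."""
    d = len(query)
    # 1. Dot-product score between the query and every key (scaled by sqrt(d)).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # 2. Softmax turns the scores into attention weights.
    weights = softmax(scores)
    # 3. The output is the weighted average of the value vectors.
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output

# Toy vectors (assumptions for illustration only).
query = (1.0, 0.5)
keys = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]

weights, output = attend(query, keys, values)
print([round(w, 3) for w in weights])
print([round(x, 3) for x in output])
```

The third key lines up best with the query, so it receives the largest weight and the third value contributes most to the output.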

Module 5

The Transformer Block

Stack the parts into the unit that repeats dozens of times in a real LLM. Activity: walk a token through positional encoding, self-attention, a residual add, and a feed-forward layer. AI anchor: GPT and Claude are deep stacks of this one block.
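
The walk-through above can be approximated with the simplified sketch below. It makes several assumptions that differ from a production block: queries, keys, and values are the token vectors themselves (no learned projections), there is a single attention head with a causal mask, layer normalization is omitted, the feed-forward weights are random rather than trained, and a second residual add is wrapped around the feed-forward layer, as is conventional. In a real model, positional information is usually added once at the input rather than inside every block.

```python
import math
import random

def positional_encoding(pos, d):
    """Sinusoidal position vector of length d for position pos."""
    return [
        math.sin(pos / 10000 ** (i / d)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / d))
        for i in range(d)
    ]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def add(a, b):
    """Element-wise vector addition, used for the residual connections."""
    return [x + y for x, y in zip(a, b)]

def causal_self_attention(xs):
    """Single head with Q = K = V = xs; each token sees only itself and earlier tokens."""
    d = len(xs[0])
    out = []
    for t, q in enumerate(xs):
        context = xs[: t + 1]
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in context]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, context)) for i in range(d)])
    return out

def feed_forward(x, w1, w2):
    """Two linear maps with a ReLU in between, applied to one token vector."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, row))) for row in w1]
    return [sum(h * w for h, w in zip(hidden, row)) for row in w2]

def transformer_block(token_vectors, w1, w2):
    d = len(token_vectors[0])
    xs = [add(v, positional_encoding(p, d)) for p, v in enumerate(token_vectors)]
    attn = causal_self_attention(xs)
    xs = [add(x, a) for x, a in zip(xs, attn)]             # residual add
    return [add(x, feed_forward(x, w1, w2)) for x in xs]   # residual around the feed-forward layer

# Random toy weights and inputs (illustrative, not trained).
rng = random.Random(0)
d_model, d_hidden = 4, 8
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(d_model)] for _ in range(d_hidden)]
w2 = [[rng.uniform(-0.5, 0.5) for _ in range(d_hidden)] for _ in range(d_model)]
tokens = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(3)]

out = transformer_block(tokens, w1, w2)
print(len(out), len(out[0]))  # 3 4
```

The output has the same shape as the input, which is what allows the same block to be stacked many times.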

Making it generate

Module 6

Training a Tiny Language Model

Where does the "knowledge" come from? Activity: train a real neural language model on a small text, epoch by epoch, and watch the loss curve fall as its samples get more coherent. AI anchor: the same gradient descent from Course 4, applied to language.
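
To make the falling loss curve concrete, here is a deliberately tiny sketch: a character-level model whose only parameters are one table of logits (one row per previous character), trained with full-batch gradient descent on cross-entropy loss. This is a stand-in chosen for brevity and is far simpler than the network the activity trains; the text, learning rate, and function name `train_tiny_lm` are illustrative assumptions.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_tiny_lm(text, epochs=30, lr=0.5):
    """Train a table of next-character logits; return the loss after each epoch."""
    chars = sorted(set(text))
    idx = {c: i for i, c in enumerate(chars)}
    pairs = [(idx[a], idx[b]) for a, b in zip(text, text[1:])]
    size = len(chars)
    logits = [[0.0] * size for _ in range(size)]  # start with no knowledge
    losses = []
    for _ in range(epochs):
        grads = [[0.0] * size for _ in range(size)]
        loss = 0.0
        for prev, nxt in pairs:
            probs = softmax(logits[prev])
            loss -= math.log(probs[nxt])  # cross-entropy for this example
            for j in range(size):
                # Gradient of cross-entropy w.r.t. the logits: probs - one_hot.
                grads[prev][j] += probs[j] - (1.0 if j == nxt else 0.0)
        # Gradient descent step on the mean loss.
        for i in range(size):
            for j in range(size):
                logits[i][j] -= lr * grads[i][j] / len(pairs)
        losses.append(loss / len(pairs))
    return losses

text = "hello hello hello world"
losses = train_tiny_lm(text)
print(round(losses[0], 3), "->", round(losses[-1], 3))
```

The first loss equals the log of the vocabulary size, which is what a model that guesses uniformly would score; every later epoch should be lower.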

Module 7

Sampling & Generation

The model gives probabilities — how do they become text? Activity: turn the temperature dial and switch on top-k and top-p sampling, watching the output move from robotic to creative to incoherent. AI anchor: the settings behind every chatbot reply.
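
The three dials can be sketched as small functions over a list of logits. The logits below are made-up numbers for four hypothetical tokens, and the function names are illustrative rather than any library's API.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Divide logits by the temperature, then softmax. Low T sharpens, high T flattens."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def renormalize(probs, keep):
    """Zero out everything not in keep and rescale the rest to sum to 1."""
    kept = set(keep)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

def top_k_filter(probs, k):
    """Keep only the k most likely tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, order[:k])

def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose total probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize(probs, keep)

def sample(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs)[0]

# Made-up logits for four hypothetical tokens.
logits = [2.0, 1.0, 0.5, -1.0]
cold = apply_temperature(logits, 0.5)   # sharper: the top token dominates
hot = apply_temperature(logits, 2.0)    # flatter: unlikely tokens get a real chance
print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
print(sample(top_k_filter(hot, 2), random.Random(0)))
```

Temperature reshapes the whole distribution, while top-k and top-p cut off its tail before sampling; chat systems often combine them.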

Capstone

Module 8 · Capstone

Why LLMs Hallucinate & How to Use Them Well

Put it together: a model trained to sound fluent is not trained to be true. Activity: see how a confident wrong answer is generated, what the context window can and cannot hold, and turn that into practical habits for trusting and verifying AI. A synthesis check ties every module together.

Why this matters

This is the course that connects everything. The conditional probability from Course 1, the vectors from Course 2, the gradient descent from Course 3, the deep network from Course 4: a large language model is all of them at once. After this, "AI" is no longer a black box. You know what is happening inside the tools you use every day.
