Module 4 — Vectors & Data
Every piece of data a model sees — an image, a sentence, a customer — is turned into a vector: a list of numbers. Once data is vectors, "how similar are these two things?" becomes a geometry question with a precise answer. This module builds vectors, the dot product, and the similarity measure that lets a language model know "king" and "queen" are related.
A vector is a list of numbers — and an arrow
A vector is an ordered list: \( \mathbf{x} = [x_1, x_2, \ldots, x_n] \). Each number is a feature — a measurable property. A house might be \( [1500, 3, 2] \) (square feet, bedrooms, baths); a word in a language model might be 768 numbers capturing its meaning. With two features we can draw the vector as an arrow on a plane, which is how we'll build intuition.
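As a minimal sketch, assuming plain Python lists stand in for vectors, the house example might be written as follows. The feature names and the second house are made up for illustration.

```python
# Hypothetical feature order for the house example: square feet, bedrooms, baths.
FEATURES = ["square_feet", "bedrooms", "baths"]

house = [1500, 3, 2]        # the vector from the text
other_house = [2200, 4, 3]  # a second, made-up house using the same feature order

# Order matters: position i always holds the same feature in every vector.
labeled = dict(zip(FEATURES, house))

# The dimension of a vector is simply how many features it has.
dimension = len(house)
```

Because both houses use the same feature order, their entries can be compared position by position, which is what the operations later in this module rely on.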
Drag the two arrows below. Everything else — length, dot product, similarity — updates live.
Length (norm): how big is a vector?
The norm \( \lVert \mathbf{x} \rVert \) is the arrow's length, straight from the Pythagorean theorem:
\[ \lVert \mathbf{x} \rVert = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} \]
It measures magnitude — how far the data point sits from the origin. Models often normalize vectors (scale them to length 1) so that comparisons are about direction, not size.
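The norm and the normalization step can be sketched in a few lines of standard-library Python. The function names `norm` and `normalize` are illustrative choices, not part of any particular library.

```python
import math

def norm(x):
    """Euclidean length of a vector given as a list of numbers."""
    return math.sqrt(sum(v * v for v in x))

def normalize(x):
    """Scale x to length 1 while keeping its direction."""
    length = norm(x)
    if length == 0:
        # The zero vector has no direction, so it cannot be normalized.
        raise ValueError("cannot normalize the zero vector")
    return [v / length for v in x]

v = [3.0, 4.0]
length = norm(v)     # the 3-4-5 right triangle, so the length is 5
unit = normalize(v)  # same direction as v, but with length 1
```

After normalizing, two vectors that differ only in size become identical, so any remaining difference between them is a difference in direction.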
The dot product: the workhorse of ML
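The dot product of two vectors \( \mathbf{a} \) and \( \mathbf{b} \) multiplies matching entries and adds them up, giving a single number:
\[ \mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = \sum_{i=1}^{n} a_i b_i \]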
It is positive when two vectors point broadly the same way (less than 90° apart), zero when they're perpendicular (unrelated), and negative when they point broadly opposite ways; its size also grows with the lengths of the vectors. That one number, a weighted sum, is the core of what a single neuron computes (before it adds a bias and applies an activation function), and it powers search, recommendations, and attention in transformers.
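A short sketch of the dot product and its three sign cases, in plain Python. The neuron's weights, inputs, and bias are made-up numbers, and adding a bias term is the usual textbook form of a neuron rather than something specific to this module.

```python
def dot(a, b):
    """Multiply matching entries and add them up."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

same_way = dot([2, 1], [4, 2])        # 8 + 2 = 10, positive
perpendicular = dot([2, 1], [-1, 2])  # -2 + 2 = 0, a right angle
opposite = dot([2, 1], [-2, -1])      # -4 - 1 = -5, negative

# A neuron's weighted sum: dot product of weights and inputs, plus a bias.
weights = [0.5, -1.0, 2.0]  # hypothetical learned weights
inputs = [4.0, 3.0, 1.0]    # hypothetical input features
bias = 0.5                  # hypothetical bias
pre_activation = dot(weights, inputs) + bias  # 2.0 - 3.0 + 2.0 + 0.5 = 1.5
```

Note that `[4, 2]` is just `[2, 1]` doubled: the dot product rewards both agreement in direction and sheer size, which is why a separate measure is needed to isolate direction.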
Cosine similarity: direction without size
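To ask "do these point the same way?" while ignoring length, divide the dot product of two vectors \( \mathbf{a} \) and \( \mathbf{b} \) by the product of their norms. The result is the cosine of the angle \( \theta \) between them, called the cosine similarity:
\[ \cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert} \]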
It runs from +1 (same direction) through 0 (perpendicular) to −1 (opposite). The playground shows it as you drag — line the arrows up and watch it climb to 1.
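Cosine similarity can be sketched directly from its definition: the dot product divided by the product of the two lengths. The function name `cosine_similarity` and the example vectors are illustrative assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b: dot product over both lengths."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so the angle is undefined.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

aligned = cosine_similarity([1, 2], [3, 6])      # same direction, different length
right_angle = cosine_similarity([1, 0], [0, 5])  # perpendicular
flipped = cosine_similarity([1, 2], [-2, -4])    # exactly opposite direction
```

Up to floating-point rounding, the three results land at the ends and midpoint of the range: about 1, exactly 0, and about −1. The lengths of the inputs make no difference to any of them.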
Make the call
Given two vectors, predict their dot product's sign and roughly how similar they are. You'll get a score.
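As a worked example of this kind of call, take two made-up vectors \( \mathbf{a} = [3, 1] \) and \( \mathbf{b} = [-1, 2] \). Multiplying matching entries and adding gives the dot product, and dividing by both lengths gives the cosine of the angle \( \theta \) between them:
\[ \mathbf{a} \cdot \mathbf{b} = (3)(-1) + (1)(2) = -1 \]
\[ \cos\theta = \frac{-1}{\sqrt{3^2 + 1^2}\,\sqrt{(-1)^2 + 2^2}} = \frac{-1}{\sqrt{10}\,\sqrt{5}} \approx -0.14 \]
The sign is negative, so the arrows lean slightly away from each other, and the value is close to 0, so they are nearly perpendicular (an angle of roughly 98°).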