Module 1 — The Agent Loop

The core loop · hands-on · about 25 minutes.

A chatbot produces a single response and terminates. An agent continues: it executes an action, examines the result, and selects its next action — iteratively — until its goal is achieved. This iterative cycle is the central concept of this course. Beneath the variation in terminology, every agent architecture reduces to a loop comprising four operations:

Perceive — read the current situation (the goal, and whatever the last action revealed).
Decide — choose the next action. This is where a real agent calls a language model; here it follows a simple rule.
Act — actually do the thing (call a tool, make a guess, take a step).
Observe — read the result of that action, and feed it back into the next "perceive."

The crucial element is the feedback loop: because each observation flows into the next decision, the agent adapts. It is not running a fixed script — it is reacting, step by step, to a world it cannot fully see ahead of time.

Execute the loop interactively

Below is the simplest possible agent. It has one goal — guess a hidden number between 1 and 100 — and one move: guess a number and get told "too high," "too low," or "correct." Watch how it perceives the range it knows, decides on the middle, acts, and observes the answer narrowing the range. Press Step to walk one perceive–decide–act–observe cycle at a time, or Run to let it finish.
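
For readers without the interactive widget, the same walk-through can be approximated in a few lines of Python. This is a minimal sketch, not the page's actual implementation: the names `Cycle` and `cycles` and the example secret 73 are assumptions made for illustration. Each value the generator yields corresponds to one perceive, decide, act, observe cycle.

```python
from typing import Iterator, NamedTuple


class Cycle(NamedTuple):
    low: int       # perceive: smallest number still possible
    high: int      # perceive: largest number still possible
    guess: int     # decide + act: the midpoint that was guessed
    verdict: str   # observe: "too high", "too low", or "correct"


def cycles(secret: int, low: int = 1, high: int = 100) -> Iterator[Cycle]:
    """Yield one perceive-decide-act-observe cycle per iteration (hypothetical helper)."""
    while True:
        guess = (low + high) // 2          # decide: middle of the known range
        if guess > secret:                 # act: the environment answers the guess
            verdict = "too high"
        elif guess < secret:
            verdict = "too low"
        else:
            verdict = "correct"
        yield Cycle(low, high, guess, verdict)
        if verdict == "correct":
            return
        if verdict == "too high":          # observe: narrow the range for the next cycle
            high = guess - 1
        else:
            low = guess + 1


if __name__ == "__main__":
    for n, c in enumerate(cycles(secret=73), start=1):
        print(f"cycle {n}: range {c.low}-{c.high}, guess {c.guess} -> {c.verdict}")
    # cycle 1: range 1-100, guess 50 -> too low
    # cycle 2: range 51-100, guess 75 -> too high
    # cycle 3: range 51-74, guess 62 -> too low
    # cycle 4: range 63-74, guess 68 -> too low
    # cycle 5: range 69-74, guess 71 -> too low
    # cycle 6: range 72-74, guess 73 -> correct
```

Calling `next()` on the generator roughly plays the role of Step, and looping over it plays the role of Run.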

This activity needs JavaScript. The lesson below still covers everything.

Why the midpoint policy is optimal

The agent does not guess arbitrarily. By selecting the midpoint of the range it still considers feasible, it eliminates at least half of the remaining candidates with every guess, so it converges on any number in 1–100 within at most seven guesses (seven suffice because 2^7 - 1 = 127 ≥ 100, while six cannot distinguish more than 2^6 - 1 = 63 numbers). This decision rule is the agent's policy: a mapping from the current observation to the next action. The midpoint rule is optimal for this game in the sense that no other rule has a smaller worst-case number of guesses. In a deployed agent, the policy is implemented by a language model; the surrounding loop is identical.
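
The seven-guess bound can be checked by brute force. The sketch below assumes integer midpoints computed as `(low + high) // 2`; the function names are hypothetical, and the count-up-from-1 "fixed script" is a deliberately naive baseline added here only for contrast.

```python
import math


def midpoint_guesses(secret: int, low: int = 1, high: int = 100) -> int:
    """Count the guesses the midpoint policy needs to find `secret`."""
    count = 0
    while True:
        guess = (low + high) // 2
        count += 1
        if guess == secret:
            return count
        if guess > secret:       # "too high": discard the guess and everything above it
            high = guess - 1
        else:                    # "too low": discard the guess and everything below it
            low = guess + 1


def scripted_guesses(secret: int, low: int = 1) -> int:
    """Baseline (assumed for contrast): guess low, low+1, ... and ignore the feedback."""
    return secret - low + 1


worst_midpoint = max(midpoint_guesses(s) for s in range(1, 101))
worst_scripted = max(scripted_guesses(s) for s in range(1, 101))
bound = math.ceil(math.log2(100 + 1))   # smallest k with 2**k - 1 >= 100

print(worst_midpoint, worst_scripted, bound)   # 7 100 7
```

The worst case for the midpoint policy matches the logarithmic bound, while the script that ignores feedback can need all 100 guesses.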

The equivalent loop expressed in code

while not state.done:
    obs    = perceive(state)          # what do I know right now?
    action = decide(obs)              # the "brain": an LLM in a real agent
    result = act(action)              # take the step / call the tool
    state  = observe(result, state)   # fold the result back in (marking done once the goal is met), then loop
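
To make the schematic concrete, here is one way the four functions could be filled in for the number-guessing agent. It is a sketch under stated assumptions rather than the course's reference code: the `State` dataclass, the `make_act` factory that hides the secret inside the environment, and storing the termination flag on the state are all illustrative choices.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


@dataclass(frozen=True)
class State:
    low: int = 1          # smallest number still possible
    high: int = 100       # largest number still possible
    done: bool = False    # set by observe() once the guess is correct
    steps: int = 0        # completed loop iterations


def perceive(state: State) -> Tuple[int, int]:
    """What do I know right now? Here: the feasible range."""
    return state.low, state.high


def decide(obs: Tuple[int, int]) -> int:
    """The policy. A real agent would call a language model here."""
    low, high = obs
    return (low + high) // 2


def make_act(secret: int) -> Callable[[int], Tuple[int, str]]:
    """Build the environment: only act() can compare against the secret."""
    def act(guess: int) -> Tuple[int, str]:
        if guess > secret:
            return guess, "too high"
        if guess < secret:
            return guess, "too low"
        return guess, "correct"
    return act


def observe(result: Tuple[int, str], state: State) -> State:
    """Fold the result back into the state for the next perceive."""
    guess, verdict = result
    steps = state.steps + 1
    if verdict == "correct":
        return replace(state, low=guess, high=guess, done=True, steps=steps)
    if verdict == "too high":
        return replace(state, high=guess - 1, steps=steps)
    return replace(state, low=guess + 1, steps=steps)


def run_agent(secret: int) -> State:
    act = make_act(secret)
    state = State()
    while not state.done:
        obs = perceive(state)
        action = decide(obs)
        result = act(action)
        state = observe(result, state)
    return state


if __name__ == "__main__":
    final = run_agent(secret=73)
    print(final.low, final.steps)   # 73 6
```

Only `decide` would change if the rule were swapped for a model call; `perceive`, `act`, `observe`, and the `while` loop would stay as they are.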

Agent frameworks and APIs such as LangChain, the OpenAI Assistants API, and Claude's tool use are more elaborate implementations of these four operations. The decide step is handed to a language model; the surrounding loop keeps the same structure.

AI anchor: this loop underlies every modern agent system

When an AI coding assistant modifies a file, when a research agent retrieves documents from the web, when a computer-use agent operates a graphical interface, each is executing this loop: perceive the state, decide on an action (the language-model call), act on the environment, observe the result, repeat. The capability gap between an agent and a single-turn chatbot does not come from a more capable underlying model; it arises because the model is embedded in a feedback loop that permits action, observation of the consequences, and revision.

Check your understanding

Answer a short set of questions on the agent loop.


Why this matters next

The agent introduced here was restricted to a single kind of action: guessing a number. In production systems, an agent's actions are tool calls, that is, invocations of external functions such as calculators, data retrievers, or calendar APIs. Module 2 introduces the agent's first concrete tool and the decision rule by which it chooses to invoke a tool rather than respond directly.

Summary: an agent is a decision rule (typically a language model) embedded in a perceive–decide–act–observe loop in which each observation informs the subsequent decision, enabling iterative adaptation toward a goal rather than a single-turn response.

Next: Calling Tools →

🎮 You ARE the agent

Now play the agent yourself. At each step of the loop you pick the right action. Wrong choices cost a life. Can you find the secret number as efficiently as the algorithm?
