Module 7 — Guardrails & Reliability

Making it reliable · hands-on · about 30 minutes.

Everything so far assumed the agent behaves. Real agents do not — not always. The model picks the wrong tool, or calls the same one forever, or, worst of all, takes a real-world action you did not want. An agent loops in the same way a person stuck on a bad assumption does: confidently, and without noticing. The difference between a demo and a system you can trust is the guardrails wrapped around the loop — the checks that catch failure before it does damage.

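Four guardrails every serious agent needs

1. Hard step cap: the loop stops after a fixed number of steps, no matter what the model wants to do next.
2. Loop detection: a repeated action is caught instead of being run again.
3. Tool validation: only tools on the allow-list can be called.
4. Human-in-the-loop: risky or irreversible actions need a person's approval first.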

Break an agent, then add guardrails

Pick a failure scenario and run it with guardrails off — watch the agent misbehave. Then switch the relevant guardrail on and run again: the trace shows it getting caught and stopped safely. The red BLOCKED lines are the guardrails doing their job.
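A minimal Python sketch of that on/off experiment. Everything in it is hypothetical and made up for illustration: the `stuck_agent` policy, the `search` tool, and the `run` function are not the activity's actual code. The agent always proposes the same call, so with loop detection off it runs until the step cap, and with it on the repeat is blocked at step 2.

```python
MAX_STEPS = 5  # hard cap so even the unguarded run terminates


def stuck_agent(history):
    """Hypothetical policy that never changes its mind."""
    return ("search", "weather in Paris")


def search(query):
    """Hypothetical tool: returns a canned result."""
    return f"no results for {query!r}"


def run(loop_detection):
    """Run the stuck agent and return the trace as a list of lines."""
    trace = []
    seen = set()  # (tool, argument) pairs already executed
    history = []
    for step in range(1, MAX_STEPS + 1):
        action = stuck_agent(history)
        if loop_detection and action in seen:
            trace.append(f"step {step}: BLOCKED repeated call {action[0]}({action[1]!r})")
            break
        tool_name, arg = action
        history.append(search(arg))
        seen.add(action)
        trace.append(f"step {step}: {tool_name}({arg!r})")
    return trace


unguarded = run(loop_detection=False)
guarded = run(loop_detection=True)

print("\n".join(unguarded))
print("---")
print("\n".join(guarded))
```

The unguarded trace has one line per step up to the cap; the guarded trace ends on a single BLOCKED line as soon as the call repeats.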

Guardrails in code — read only, nothing to install
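already_tried = set()                      # actions run so far (actions must be hashable)
for step in range(MAX_STEPS):              # 1. hard step cap
    action = decide(state)
    if action.name not in tools:           # 3. tool validation: unknown tool, stop
        break
    if action in already_tried:            # 2. loop detection: same action again, stop
        break
    if action.is_risky and not human_ok(action):  # 4. human-in-the-loop: approval denied, stop
        break
    observation = tools[action.name](action.arg)
    already_tried.add(action)
    state.append(observation)              # assumes state is a list; feed the result back to decide()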

None of these make the agent smarter — they make it safe. A capable agent without guardrails is a liability; the guardrails are what let you actually deploy one.
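The loop above stops with a bare `break`, which hides which guardrail fired. One possible refinement, sketched here under assumptions, returns a stop reason alongside the observations so the caller can log it or show it to a user. The `Action` dataclass, `guarded_loop`, the toy tools, and the toy `decide` policy are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass

MAX_STEPS = 10


@dataclass(frozen=True)  # frozen makes Action hashable, so it can live in a set
class Action:
    name: str
    arg: str
    is_risky: bool = False


def guarded_loop(decide, tools, human_ok, max_steps=MAX_STEPS):
    """Run decide() in a loop and return (stop_reason, observations)."""
    observations = []
    already_tried = set()
    for _ in range(max_steps):
        action = decide(observations)
        if action is None:  # the policy signals it is done
            return "finished", observations
        if action.name not in tools:
            return f"blocked: unknown tool {action.name!r}", observations
        if action in already_tried:
            return f"blocked: repeated call to {action.name!r}", observations
        if action.is_risky and not human_ok(action):
            return f"blocked: human declined {action.name!r}", observations
        observations.append(tools[action.name](action.arg))
        already_tried.add(action)
    return "blocked: step cap reached", observations


# Toy setup: the allow-list contains two tools; the policy asks for a third.
tools = {
    "lookup": lambda arg: f"found {arg}",
    "send_email": lambda arg: f"sent {arg}",
}


def decide(observations):
    """Hypothetical policy: look something up, then reach for a tool that is not allowed."""
    if not observations:
        return Action("lookup", "invoice 42")
    return Action("delete_database", "prod")


reason, observations = guarded_loop(decide, tools, human_ok=lambda action: True)
print(reason)        # blocked: unknown tool 'delete_database'
print(observations)  # ['found invoice 42']
```

Returning a reason instead of breaking silently is a design choice, not a requirement: it costs nothing at run time and makes a trace of blocked actions easy to produce.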

AI anchor: guardrails are why agents can ship

The reason production agents are trusted with real tasks is not that the model never errs; it is that the system around it assumes it will. Step caps stop runaway bills. Loop detection stops an agent burning cycles. Allow-lists keep it from calling tools it should not. And human-in-the-loop confirmation on irreversible actions ("are you sure you want to send this?") is arguably the most important safety pattern in agentic AI. The capability comes from the model; the trust comes from the guardrails.
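One way to apply the confirmation pattern is to wrap the irreversible tool itself, so it cannot run without approval regardless of which loop calls it. The sketch below assumes this wrapper approach; `require_approval`, `send_email`, `console_approve`, and `deny_all` are hypothetical names, and the email tool only appends to a list rather than sending anything.

```python
def require_approval(tool, describe, approve):
    """Wrap a tool so it runs only after approve(description) returns True."""
    def guarded(arg):
        description = describe(arg)
        if not approve(description):
            return f"BLOCKED: {description} was not approved"
        return tool(arg)
    return guarded


sent = []  # stands in for the outside world


def send_email(recipient):
    """Hypothetical irreversible tool."""
    sent.append(recipient)
    return f"email sent to {recipient}"


def console_approve(description):
    """Ask a person at the terminal; anything other than 'y' counts as no."""
    answer = input(f"Are you sure you want to {description}? [y/N] ")
    return answer.strip().lower() == "y"


def deny_all(description):
    """Fixed policy for tests and unattended runs: never approve."""
    return False


safe_send = require_approval(
    send_email,
    describe=lambda recipient: f"send this email to {recipient}",
    approve=deny_all,  # swap in console_approve for an interactive session
)

result = safe_send("customer@example.com")
print(result)  # BLOCKED: send this email to customer@example.com was not approved
print(sent)    # []
```

Defaulting to "no" on anything but an explicit "y" means a missing or garbled answer fails safe, which matters most when the action cannot be undone.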

Why this matters next

You now have every piece: loop, tools, ReAct, memory, planning, routing, and guardrails. Module 8 assembles them into one complete agent on a real task, and then asks the question that separates good engineers from hype: when should you NOT use an agent at all?

One-sentence summary: agents fail (they loop, misroute, and take unwanted actions), so reliability comes from guardrails around the loop: a hard step cap, loop detection, tool validation, and human-in-the-loop approval for risky or irreversible actions.
