Most “AI Agents”
Are Just Chatbots
with Extra Steps
88% of AI pilots never reach production.
The #1 cause isn't bad models. It's wrong architecture.
By Saheb Singh · Enterprise AI, American Express. Ex-Google. CMU CS.
The Architecture Spectrum
Four things that get called “AI” — only one is an agent.
Gartner, 2025: AI Agents at Peak of Inflated Expectations. GenAI already in the Trough of Disillusionment.
Most 'AI Agents' Aren't Agents
An LLM in a pipeline ≠ an agent. Gartner: only ~130 of thousands of “agent” vendors are real.
Let's start with the uncomfortable truth: most 'AI agents' in production today are agentic workflows — deterministic pipelines with an LLM at one or two steps. Gartner estimates only ~130 of thousands of 'agentic AI' vendors are real. The rest are agent-washing.
They look like agents in demos. They're marketed as agents. But they can't adapt to novel situations, don't maintain state across sessions, and follow pre-defined orchestration paths. Deloitte found only 11% of enterprises have agents in production — and of those, Menlo Ventures found only 16% are truly agentic.
This isn't pedantic. The distinction determines your architecture, your risk model, your governance requirements, and your cost structure. Get it wrong, and you'll build chatbot-level guardrails for agent-level autonomy — or agent-level overhead for a problem a simple pipeline could solve.
A Stateless Function
f(prompt) → response
A chatbot is a pure function: input in, output out. No memory between calls. No tools. No planning. Gmail's Smart Reply, customer service FAQ bots, most 'AI-powered' support widgets — all chatbots. Used by billions.
This isn't a criticism. The question is whether your problem requires more than a stateless text transformation. If not, you've found the cheapest, lowest-risk architecture. At ~$0.001/query, chatbots are 100-1000x cheaper than agents.
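As a sketch, the whole architecture fits in one pure function. The `call_llm` stub below is an illustrative stand-in for any completion API (so the example runs offline); the point is that nothing persists between calls:

```python
def call_llm(prompt: str) -> str:
    # Placeholder for a real model call -- a canned echo keeps it runnable.
    return f"echo: {prompt}"

def chatbot(prompt: str) -> str:
    # f(prompt) -> response. No memory, no tools, no planning.
    # Calling it twice with the same input gives the same output.
    return call_llm(prompt)

print(chatbot("How do I reset my password?"))
```

Because the function is stateless, it scales trivially and its failure modes are bounded to a single bad response.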
Human-in-the-Loop
Add a context window and a human checkpoint. A copilot suggests, a human decides. GitHub Copilot doesn't push code — a developer hits Tab or Escape. The blast radius of a bad suggestion is zero until a human approves it.
McKinsey reports ~70% of Fortune 500 use Microsoft 365 Copilot. But here's the Gen AI Paradox: horizontal copilots deliver diffuse, hard-to-measure gains. The real value is in vertical, domain-specific copilots — and 90% of those are stuck in pilot mode.
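The checkpoint pattern can be sketched in a few lines. The `suggest` stub and the `approve` callback are illustrative stand-ins for a real model call and a real accept/reject UI:

```python
from typing import Callable, Optional

def suggest(context: str) -> str:
    # Placeholder for a real completion call.
    return f"suggested reply for: {context}"

def copilot(context: str, approve: Callable[[str], bool]) -> Optional[str]:
    # The model suggests; a human decides. Nothing ships without
    # approval, so a bad suggestion has zero blast radius.
    suggestion = suggest(context)
    return suggestion if approve(suggestion) else None

# Simulate a human accepting (Tab) and rejecting (Escape) a suggestion.
print(copilot("ticket #42", approve=lambda s: True))
print(copilot("ticket #43", approve=lambda s: False))
```

The design choice that matters is the return type: the system can only ever produce an approved suggestion or nothing at all.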
Autonomous Reasoning Loop
Remove the human from the loop. Give the system persistent memory, tool access via MCP (Model Context Protocol), and multi-step planning. It reasons about a goal, acts, observes results, adjusts — a ReAct loop.
This is where things get powerful and dangerous. ASAPP found agents fail on multi-step tasks ~70% of the time. Inference costs multiply (5-50 LLM calls per task). And autonomous systems that can take real-world actions require fundamentally different governance than suggestion engines.
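A minimal ReAct loop looks like this. The `plan` policy and the single `lookup` tool are toy stand-ins for real LLM and MCP calls, but the shape is the same: reason, act, observe, repeat until done or the step budget runs out:

```python
def plan(goal: str, memory: dict):
    # Stand-in for an LLM planning call: returns (thought, action, arg).
    if "result" in memory:
        return ("have an answer", "finish", memory["result"])
    return ("need data", "lookup", goal)

TOOLS = {"lookup": lambda q: f"docs about {q}"}  # illustrative tool registry

def agent(goal: str, max_steps: int = 5):
    memory: dict = {}  # state persists across steps, unlike a chatbot
    for _ in range(max_steps):
        thought, action, arg = plan(goal, memory)  # reason
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)  # act
        memory["result"] = observation    # observe, update state
    return None  # budget exhausted -- multi-step failure is the common case

print(agent("refund policy"))
```

Note that every iteration is another model call: the 5-50x cost multiplier falls directly out of this loop structure.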
The companies deploying agents successfully started with copilots, learned where humans add value and where they don't, and gradually widened the autonomy boundary. They earned the right to automate.
What happens when you get it wrong
War Story
The $2.1M “Agent” That Was Really a Chatbot
A real scenario. Anonymized details, real architecture decisions, real consequences. This is what happens when you deploy agent-level autonomy with chatbot-level governance.
An engineering team at a Series C fintech builds an 'AI agent' for customer support escalation. The system reads tickets, queries internal docs, and drafts responses. Leadership calls it their 'autonomous support agent' on the earnings call.
Next issue
The Agentic AI Quality Crisis
57% of teams have agents in production. Only 37% evaluate whether their outputs are correct. The quality gap nobody talks about.
Next issue: March 3 · Free · Unsubscribe anytime
So what should you actually build?
Decision Framework
What should you actually build?
The answer isn't always “agents.” Walk through these questions. Be honest — the right architecture is the simplest one that solves your actual problem.
What does your AI system need to do?
Start with the problem, not the technology. The right architecture follows from the requirements.
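As an illustration, the spectrum above collapses into a toy decision helper. The three inputs (persistent memory, tool access, unsupervised action) mirror the capabilities that separate chatbot, copilot, and agent earlier in this piece; the mapping is a sketch, not a prescriptive rubric:

```python
def pick_architecture(needs_memory: bool, needs_tools: bool,
                      acts_unsupervised: bool) -> str:
    # Pick the SIMPLEST architecture that satisfies the requirements.
    if needs_memory and needs_tools and acts_unsupervised:
        return "agent"    # autonomous loop; agent-level governance required
    if needs_memory or needs_tools:
        return "copilot"  # human-in-the-loop checkpoint
    return "chatbot"      # stateless f(prompt) -> response

print(pick_architecture(False, False, False))  # a FAQ bot
print(pick_architecture(True, True, False))    # a drafting assistant
print(pick_architecture(True, True, True))     # full autonomy
```

The order of the branches is the point: you only earn the "agent" answer after every cheaper option has been ruled out.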
Governance Audit
7 questions before you deploy an agent.
Gartner expects 80%+ of unauthorized AI transactions to be internal violations — not external attacks. The risk is already inside the building. If you can't answer all seven, you're not ready.
The uncomfortable truth: most organizations jumping to agents don't have the governance infrastructure to handle autonomous AI. McKinsey calls it the “Gen AI Paradox” — 80% of companies with agents see no EBIT impact. The ones that do spent 70% of their effort on people and process, not model selection.