Definitions worth getting right
A chatbot is a conversational UI on top of an LLM. The user types, the model responds, the conversation thread holds the state. Examples: a help-centre bot, a sales-qualification bot, a copilot in a sidebar.
An AI agent is software that uses an LLM to decide what to do next, given a goal. It plans, calls tools, reads the results, decides again, and either continues or stops. The LLM is the brain. The orchestration layer is the body. Example: a research agent that pulls data from your CRM, your billing system, and the web, then drafts a renewal pitch.
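That plan–act–observe loop can be sketched in a few lines. This is a minimal illustration, not a production implementation: the LLM call is stubbed with a deterministic function, and the tool name `lookup_crm` is hypothetical.

```python
def stub_llm_decide(goal, history):
    """Stand-in for an LLM call: picks the next action given the goal and history."""
    if not history:
        return {"action": "lookup_crm", "args": {"account": goal}}
    return {"action": "finish", "args": {"summary": f"drafted pitch for {goal}"}}

# Hypothetical tool registry; real agents would wrap CRM/billing/web APIs here.
TOOLS = {
    "lookup_crm": lambda account: f"CRM record for {account}",
}

def run_agent(goal, max_steps=5):
    history = []
    for _ in range(max_steps):
        decision = stub_llm_decide(goal, history)
        if decision["action"] == "finish":
            return decision["args"]["summary"]
        result = TOOLS[decision["action"]](**decision["args"])
        history.append((decision, result))  # the observation feeds the next decision
    return "stopped: step limit reached"
```

Note the `max_steps` parameter: even in a sketch, the loop is bounded, because an unbounded loop is the canonical agent failure mode.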
Agentic AI is the broader pattern. Usually it means multiple agents collaborating, each specialised: a planner, a researcher, a writer, a reviewer. Same architecture as a single agent, just composed.
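Composition can be as simple as piping one specialist's output into the next. The sketch below fakes each specialist with a plain function to show the shape; in practice each would be its own agent with its own prompt, tools, and memory.

```python
# Hypothetical specialists: each stands in for a full agent.
def planner(task):
    return [f"research {task}", f"draft {task}", f"review {task}"]

def researcher(step):
    return f"notes for: {step}"

def writer(notes):
    return f"draft based on [{notes}]"

def reviewer(draft):
    return draft + " (approved)"

def run_team(task):
    steps = planner(task)          # planner decomposes the goal
    notes = researcher(steps[0])   # researcher handles the first sub-task
    draft = writer(notes)          # writer turns notes into a draft
    return reviewer(draft)         # reviewer gates the output
```

Same architecture as the single-agent loop, just with the "brain" split across roles.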
Feature comparison matrix
| Feature | Chatbot | AI Agent | Agentic AI |
|---|---|---|---|
| Reasoning | Limited | Yes | Distributed |
| Multi-step planning | No | Yes | Yes |
| Tool use | Optional | Yes | Yes |
| Memory | Conversation | Short + long | Shared + private |
| Autonomy | None | Bounded | Bounded per agent |
| Observability needs | Low | High | Very high |
| Build cost (typical) | $5K-25K | $15K-80K | $25K-150K |
| Run cost (typical/mo) | $50-500 | $200-5K | $500-15K |
When to build a chatbot
- You have a knowledge base of FAQs and 80% of inbound is variations of the same questions.
- You need lead qualification on a marketing site, with handoff to sales.
- You want a copilot inside an app that helps users navigate features.
- You need an internal Q&A bot that answers from your own docs (RAG).
- You have low traffic and the cost of an agent is overkill.
When to build an AI agent
- The task spans multiple systems (CRM + billing + email + analytics).
- Resolution requires research and decisions, not just answers.
- You want autonomous handling of well-scoped workflows (lead enrichment, ticket triage, content drafting).
- The ROI of automation justifies the build cost (typically $50K+ in saved hours per year).
- You can tolerate non-deterministic behaviour with proper observability and fallbacks.
Production considerations either way
Chatbots and agents both fail in production. The failures look different.
Chatbots fail by hallucinating wrong answers. Mitigation: ground the bot in a retrieval layer (RAG), set a confidence threshold, and route to a human when confidence drops.
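The routing logic is simple enough to show directly. A minimal sketch, assuming a `retrieve` function that returns passages plus a relevance score and a `generate` function that drafts an answer; the 0.7 threshold is an assumed value you would tune against your eval set.

```python
CONFIDENCE_THRESHOLD = 0.7  # assumed value; tune against your eval set

def answer_or_escalate(question, retrieve, generate):
    """retrieve(q) -> (passages, score); generate(q, passages) -> answer string."""
    passages, score = retrieve(question)
    if score < CONFIDENCE_THRESHOLD:
        # Low retrieval confidence: don't let the model guess.
        return {"route": "human", "reason": "low retrieval confidence"}
    return {"route": "bot", "answer": generate(question, passages)}
```

The point is that the bot never answers from thin air: below the threshold, the question goes to a person, not to the model's imagination.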
Agents fail by running off the rails (looping, calling the wrong tool, escalating cost without progress). Mitigation: hard step limits, cost caps, observability with alerting, and graceful escalation paths.
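Step limits and cost caps belong in one guard object the loop consults on every iteration. A sketch, with assumed default limits; a real system would also emit metrics and trigger the escalation path instead of just raising.

```python
class Guardrails:
    """Hard limits checked on every agent step."""

    def __init__(self, max_steps=10, max_cost_usd=2.0):  # assumed defaults
        self.max_steps = max_steps
        self.max_cost_usd = max_cost_usd
        self.steps = 0
        self.cost = 0.0

    def check(self, step_cost_usd):
        """Call once per step; raises when a limit is breached."""
        self.steps += 1
        self.cost += step_cost_usd
        if self.steps > self.max_steps:
            raise RuntimeError("step limit exceeded: escalate to human")
        if self.cost > self.max_cost_usd:
            raise RuntimeError("cost cap exceeded: escalate to human")
```

In the agent loop, `guard.check(cost_of_this_call)` runs before each tool or model call, so a looping agent burns at most `max_steps` steps or `max_cost_usd` dollars, whichever comes first.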
Both need eval sets. The mistake we see most: shipping without an eval set, then scrambling to build one when prod traffic surfaces edge cases. Build the eval set first.
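An eval set doesn't have to be elaborate to be useful. A minimal sketch: a list of cases with a must-contain check, run against whatever callable fronts your bot or agent. The example cases and check style are illustrative; real evals usually layer on semantic scoring.

```python
# Illustrative cases; a real eval set comes from your own traffic and edge cases.
EVAL_SET = [
    {"input": "How do I reset my password?", "must_contain": "reset"},
    {"input": "What plans do you offer?", "must_contain": "plan"},
]

def run_evals(bot, eval_set):
    """bot is any callable str -> str. Returns the inputs that failed."""
    failures = []
    for case in eval_set:
        output = bot(case["input"])
        if case["must_contain"].lower() not in output.lower():
            failures.append(case["input"])
    return failures
```

Run it in CI before every prompt or model change; a non-empty failure list blocks the deploy.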
A simple decision rule
If the task can be described as "answer this kind of question," build a chatbot. If the task can be described as "handle this kind of work," build an agent. If the task naturally splits into sub-tasks with different specialists, consider agentic AI.
Most deployments start with a chatbot and grow into an agent when the scope expands. That's fine; just don't pretend the rebuild cost is zero.
