The state of LangChain
LangChain peaked in 2023. It made building LLM apps approachable, but it also became infamous for moving APIs, awkward abstractions, and a meta-framework feel that hid what was really happening.
Two things changed. The frontier models (Claude 3.7+, GPT-4o, GPT-4.1) got tool-use right at the API layer, so the wrapper-heavy LangChain pattern started to feel like extra steps. And LangGraph (from the LangChain team) emerged as the cleaner answer for stateful, multi-step agents.
We still use LangChain. Just narrowly, for chains and prompt templates. For real agents we reach for LangGraph or write the orchestration ourselves.
1. LangGraph
LangChain's own successor for stateful agents. Graph-based, with explicit state and transitions. Plays well with LangSmith for observability.
When to use: stateful multi-step agents, multi-agent systems with explicit handoffs, when you want LangSmith tracing.
When not to use: trivial single-shot tasks, when you want zero LangChain dependencies.
2. LlamaIndex
RAG-first. Better default chunking and embedding strategies than LangChain's retrieval module. Cleaner ingestion pipelines.
When to use: the core problem is retrieval over your documents. Q&A bots, knowledge bases, document analysis.
When not to use: the core problem is action, not retrieval. Use it for the retrieval module and a different framework for the agent layer.
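A minimal sketch of the RAG-first workflow: ingest a directory, build a vector index, query it. Assumes LlamaIndex's current `llama_index.core` package layout, a `./docs` directory, and an embedding model/LLM configured in the environment (by default LlamaIndex looks for an OpenAI key); the question string is hypothetical.

```python
# LlamaIndex sketch: ingestion -> index -> query engine.
# Requires an embedding/LLM provider configured (e.g. OPENAI_API_KEY in env).
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./docs").load_data()  # ingestion pipeline
index = VectorStoreIndex.from_documents(documents)       # chunking + embedding
query_engine = index.as_query_engine()                   # retrieval + synthesis
response = query_engine.query("What does our refund policy say?")
print(response)
```

The point of the defaults: chunking, embedding, and retrieval strategy are all swappable, but you get sensible behaviour without touching any of them.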
3. CrewAI
Multi-agent collaboration as a first-class concept. Roles, tasks, processes baked into the API. Friendly for non-developers but production-grade for teams that follow the patterns.
When to use: the workflow naturally splits into specialised agents (researcher, writer, reviewer). Content ops, sales workflows, support triage.
When not to use: single-agent tasks, when the "crew" abstraction adds ceremony without payoff.
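The role/task/process split looks roughly like this. A hedged sketch with a hypothetical researcher/writer pair; assumes an LLM key is configured in the environment and that you are on a recent CrewAI version (where `Task` requires `expected_output`).

```python
# CrewAI sketch: roles, tasks, and the crew are first-class objects.
from crewai import Agent, Task, Crew

researcher = Agent(role="Researcher", goal="Find sources on the topic",
                   backstory="A thorough web researcher.")
writer = Agent(role="Writer", goal="Draft a short summary",
               backstory="A concise technical writer.")

research = Task(description="Research the topic and collect notes",
                expected_output="Bullet-point notes", agent=researcher)
draft = Task(description="Turn the notes into a summary",
             expected_output="A one-paragraph summary", agent=writer)

crew = Crew(agents=[researcher, writer], tasks=[research, draft])
result = crew.kickoff()  # runs the tasks in sequence, handing off between agents
```

When your workflow genuinely is "researcher hands to writer hands to reviewer", this maps one-to-one. When it isn't, the same objects become ceremony.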
4. Microsoft Semantic Kernel
Microsoft's LLM orchestration framework. Strongest in .NET environments, also has Python and Java support. Plugins, planners, and memory modules.
When to use: you're a Microsoft shop, .NET-heavy, want Azure OpenAI and Copilot integration.
When not to use: Python-first teams, anything not running on Azure, when LangGraph would do the job in fewer files.
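For orientation, a rough Python sketch of registering an Azure OpenAI service on a kernel and invoking a prompt. The deployment name, endpoint, and key are placeholders, and Semantic Kernel's API has shifted across versions, so treat this as a shape, not a recipe, and check the current docs.

```python
# Semantic Kernel sketch (Python): kernel + Azure OpenAI chat service.
import asyncio
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

kernel = sk.Kernel()
kernel.add_service(AzureChatCompletion(
    deployment_name="my-gpt4o-deployment",              # placeholder
    endpoint="https://my-resource.openai.azure.com",    # placeholder
    api_key="...",                                      # placeholder
))

async def main():
    # Plugins, planners, and memory hang off the same kernel object.
    answer = await kernel.invoke_prompt("Summarise our Q3 numbers in one line.")
    print(answer)

asyncio.run(main())
```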
5. Haystack
Search-first orchestration. Strong on RAG, document QA, and pipelines that combine retrieval, reranking, and generation.
When to use: you're building a search product or document QA at scale, you want strong rerankers, you want enterprise track record.
When not to use: agentic builds, anything where retrieval is not the central problem.
6. AutoGen
Microsoft Research project. Multi-agent conversations as the central pattern. Excellent for research, less obvious for production.
When to use: research-grade multi-agent systems, exploring conversational agent patterns, when you want flexible agent composition.
When not to use: production with hard SLOs, when you need straightforward observability, when CrewAI or LangGraph would be enough.
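The central pattern is two (or more) agents talking until the task resolves. A hedged sketch using the classic `pyautogen` API; the `llm_config` is a placeholder and code execution is disabled, and the message is invented for illustration.

```python
# Classic AutoGen sketch: an assistant and a user proxy in conversation.
from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    "assistant",
    llm_config={"config_list": [{"model": "gpt-4o"}]},  # placeholder config
)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",       # fully automated, no human in the loop
    code_execution_config=False,    # don't let the agent run code
)

# The conversation loops until the assistant signals termination.
user_proxy.initiate_chat(assistant, message="Plan a benchmark for our retriever.")
```

That open-ended loop is exactly what makes it great for research and nerve-wracking behind an SLO.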
7. Custom orchestrator
Honestly, our most common choice for simple agents: a few hundred lines of TypeScript or Python that handle prompt formatting, tool calling, retries, and logging. No framework.
When to use: the agent is well-scoped, the team would rather own a small loop than carry a framework dependency, you want full control of the prompt loop.
When not to use: complex multi-agent state, when you'd be re-implementing LangGraph from scratch.
A quick code comparison
Same task, three ways: an agent that takes a user question, calls a search tool, and returns an answer.
LangChain (legacy pattern):
from langchain.agents import initialize_agent, Tool
from langchain.chat_models import ChatAnthropic

tools = [Tool(name="search", func=search_fn, description="Search the web")]
agent = initialize_agent(tools, ChatAnthropic(), agent="zero-shot-react-description")
result = agent.run(question)

LangGraph (current pattern):
from langgraph.graph import StateGraph, START, END
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage

llm = ChatAnthropic(model="claude-sonnet-4-6")
graph = StateGraph(AgentState)  # AgentState: TypedDict with an accumulating "messages" key
graph.add_node("call_model", lambda s: {"messages": [llm.invoke(s["messages"])]})
graph.add_node("call_tool", lambda s: {"messages": [search_fn(s["messages"][-1])]})
graph.add_edge(START, "call_model")
graph.add_conditional_edges("call_model", should_continue)  # routes to "call_tool" or END
graph.add_edge("call_tool", "call_model")
agent = graph.compile()
result = agent.invoke({"messages": [HumanMessage(question)]})

Custom (TypeScript):
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const messages = [{ role: 'user', content: question }];
const tools = [{ name: 'search', input_schema: searchSchema, description: 'Search the web' }];
let response = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  tools,
  messages,
});
while (response.stop_reason === 'tool_use') {
  const toolUse = response.content.find(c => c.type === 'tool_use');
  const toolResult = await search(toolUse.input);
  messages.push(
    { role: 'assistant', content: response.content },
    { role: 'user', content: [{ type: 'tool_result', tool_use_id: toolUse.id, content: toolResult }] }
  );
  response = await client.messages.create({ model: 'claude-sonnet-4-6', max_tokens: 1024, tools, messages });
}

Our default in April 2026
Single-agent simple builds: custom TypeScript with the Anthropic SDK or Vercel AI SDK. Multi-step agents with explicit state: LangGraph. RAG layer: LlamaIndex. Multi-agent collaboration: CrewAI when the role abstraction fits, LangGraph when it doesn't.
We rarely reach for vanilla LangChain anymore. When we do, it's for prompt templates and chains, not for the agent layer.
