How it works
All input messages must fit in the context window. When a conversation grows too long to fit, you either summarise older messages or move them to long-term memory and retrieve the relevant ones when needed. Cost scales with the amount of context sent: a request carrying 1M input tokens on Claude Opus 4.7 costs $5.
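As a rough illustration of the trade-off, the sketch below keeps only the newest messages that fit a token budget and estimates input cost per request. It is a minimal sketch, not a definitive implementation: the names Message, estimate_tokens, trim_to_budget and input_cost_usd are hypothetical, the 4-characters-per-token heuristic is an assumption (a real tokenizer will give different counts), and the price constant is simply the $5 per 1M input tokens figure quoted above.

```python
from dataclasses import dataclass

# Taken from the figure quoted in the text: 1M input tokens = $5.
PRICE_PER_MILLION_INPUT_TOKENS = 5.00


@dataclass
class Message:
    role: str
    content: str


def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def trim_to_budget(
    messages: list[Message], budget: int
) -> tuple[list[Message], list[Message]]:
    """Return (dropped, kept): the newest messages that fit in `budget`
    tokens are kept; everything older is dropped."""
    kept: list[Message] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that
    # would push the running total over the budget.
    for msg in reversed(messages):
        tokens = estimate_tokens(msg.content)
        if used + tokens > budget:
            break
        kept.append(msg)
        used += tokens
    kept.reverse()  # restore chronological order
    dropped = messages[: len(messages) - len(kept)]
    return dropped, kept


def input_cost_usd(input_tokens: int) -> float:
    """Input cost of a single request at the rate assumed above."""
    return input_tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS
```

The dropped messages are the ones you would summarise or write to long-term memory. Because the whole retained context is resent on every request, the estimated input cost is paid again on each turn.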
Example
A document analysis agent processing a 200-page legal contract can fit the entire contract (~140K tokens) in the context window, together with the system prompt and the analysis instructions, and still have room left over for the response.
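The arithmetic behind this example can be sketched as a simple budget check. Only the ~140K contract size comes from the text; the 1M-token window, the system prompt and instruction sizes, and the response reservation are hypothetical values chosen for illustration, as is the function name fits_in_context.

```python
def fits_in_context(
    context_window: int,
    input_parts: dict[str, int],
    reserved_for_response: int,
) -> tuple[bool, int]:
    """Return (fits, headroom), where headroom is the number of tokens
    left after all input parts and the reserved response space."""
    total_input = sum(input_parts.values())
    headroom = context_window - total_input - reserved_for_response
    return headroom >= 0, headroom


# Hypothetical sizes, except the contract (~140K tokens, from the text).
parts = {
    "contract": 140_000,
    "system_prompt": 2_000,
    "analysis_instructions": 1_000,
}

# Assumed 1M-token window with 8K tokens reserved for the response.
fits, headroom = fits_in_context(1_000_000, parts, reserved_for_response=8_000)
print(fits, headroom)  # True 849000
```

Running the same check against a smaller window shows when the contract would instead have to be chunked or summarised.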
