GLOSSARY · Concepts

Context Window

The maximum number of tokens an LLM can process in a single request. This includes the system prompt, conversation history, and any retrieved documents. As of April 2026, frontier models all offer 1M-token context windows (Claude Opus 4.7, GPT-5.5, Gemini 3.1 Pro).

How it works

All input messages must fit within the context window. When a conversation grows too long, you must either summarise older messages or retrieve relevant history from long-term memory. Cost scales with context: 1M tokens of input on Claude Opus 4.7 costs $5 per request.
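The truncation strategy above can be sketched in a few lines. This is a minimal illustration, not any provider's API: it estimates tokens with a rough 4-characters-per-token heuristic (real systems use the model's tokenizer), and drops the oldest messages until the rest fit the budget.

```python
CONTEXT_WINDOW = 1_000_000   # hypothetical 1M-token limit
RESPONSE_RESERVE = 8_000     # leave room for the model's reply


def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token.
    return max(1, len(text) // 4)


def trim_history(system_prompt: str, messages: list[str],
                 window: int = CONTEXT_WINDOW,
                 reserve: int = RESPONSE_RESERVE) -> list[str]:
    """Drop the oldest messages until everything fits in the window."""
    budget = window - reserve - estimate_tokens(system_prompt)
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):  # walk newest-first
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))    # restore chronological order
```

Summarisation works the same way structurally: instead of dropping the oldest messages, you replace them with a single condensed message before re-checking the budget.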

Example

A document analysis agent processing a 200-page legal contract can fit the entire contract (~140K tokens) in the context window plus the system prompt and analysis instructions, with room left over for the response.
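A pre-flight check for this kind of agent can be sketched as follows. The window size and the $5-per-1M-input-token price follow the figures quoted above; the function names are illustrative, not from any real SDK.

```python
CONTEXT_WINDOW = 1_000_000   # 1M-token window, per the figures above
PRICE_PER_MTOK = 5.00        # USD per 1M input tokens, per the figures above


def fits_in_context(doc_tokens: int, prompt_tokens: int,
                    response_reserve: int = 16_000) -> bool:
    """Will the document, prompt, and a reserved reply fit in the window?"""
    return doc_tokens + prompt_tokens + response_reserve <= CONTEXT_WINDOW


def input_cost_usd(total_input_tokens: int) -> float:
    """Estimated input cost for a single request."""
    return total_input_tokens / 1_000_000 * PRICE_PER_MTOK


# A ~140K-token contract plus a 2K-token prompt fits comfortably,
# and the input side of the request costs well under a dollar.
assert fits_in_context(140_000, 2_000)
```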


Need to actually use Context Window?

We build production AI systems that put these concepts to work. In 30 minutes, we'll map your use case.