How it works
GPT-5 is a multi-model deployment: a router decides which model handles each query, choosing among the standard model, the Thinking variant (which runs longer chains of internal reasoning), and the Pro variant (highest capability, longest context).
Example
A customer-facing agent uses the standard GPT-5 by default for cost efficiency, automatically escalates to the Thinking variant when a query requires multi-step reasoning, and reserves Pro for the rare cases that need maximum capability.
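
The escalation pattern in this example can be sketched as a small application-side routing function. This is only an illustration under stated assumptions, not how the actual router is implemented: the tier names, the cue words, and the needs_max_capability flag are all hypothetical, and a real system would likely use a learned classifier rather than keyword matching.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    """Hypothetical labels for the three variants described above."""
    STANDARD = "standard"
    THINKING = "thinking"
    PRO = "pro"


# Assumed cue phrases that hint at multi-step reasoning (illustrative only).
REASONING_CUES = ("step by step", "prove", "compare", "plan", "why")


@dataclass
class Query:
    text: str
    # Hypothetical flag the application sets for the rare maximum-capability cases.
    needs_max_capability: bool = False


def choose_tier(query: Query) -> Tier:
    """Pick the cheapest tier that plausibly fits the query."""
    if query.needs_max_capability:
        return Tier.PRO
    lowered = query.text.lower()
    if any(cue in lowered for cue in REASONING_CUES):
        return Tier.THINKING
    # Default to the standard model for cost efficiency.
    return Tier.STANDARD


if __name__ == "__main__":
    for q in (
        Query("What are your opening hours?"),
        Query("Compare these two subscriptions step by step."),
        Query("Audit this contract in full.", needs_max_capability=True),
    ):
        print(q.text, "->", choose_tier(q).value)
```

Checking the maximum-capability flag first keeps the most expensive path explicit and rare, while everything that matches no cue stays on the default, cheapest tier.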
