
Temperature

A sampling parameter that controls randomness in LLM output. At temperature 0, sampling is deterministic: the same input always produces the same output. Higher temperatures make output more varied and less predictable. The default is usually 0.7. Note: Claude Opus 4.7 removed temperature/top_p/top_k as a breaking change; sampling is now adaptive.

How it works

After the LLM produces a probability distribution over the next token, temperature rescales that distribution before sampling: the logits are divided by the temperature before the softmax is applied. Temperature 0 degenerates to greedy decoding, always picking the highest-probability token. Temperatures above 1 flatten the distribution, making lower-probability tokens more likely; temperatures below 1 sharpen it toward the top tokens.
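As an illustration of the scaling step (a minimal sketch, not any particular model's implementation), the logits-divided-by-temperature softmax can be written as:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by temperature, then apply softmax.

    Temperature 0 is treated as greedy decoding: all probability
    mass goes to the highest-logit token.
    """
    if temperature == 0:
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits for three candidate tokens
logits = [2.0, 1.0, 0.1]
for t in (0, 0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Running this shows the distribution collapsing to the argmax at temperature 0 and flattening as the temperature rises past 1.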

Example

For a customer service agent where consistency matters, temperature 0.2 produces nearly the same response to the same question every time. For a creative writing agent, temperature 0.9 produces variation across calls.
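The consistency-versus-variation trade-off can be seen by sampling repeatedly from the same (made-up) next-token scores at two temperatures, using the temperature-scaled softmax described above; the token scores and counts here are illustrative only:

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Sample one token index from a temperature-scaled softmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    # random.choices normalizes the weights itself
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [3.0, 1.5, 0.5]  # hypothetical scores for three tokens
rng = random.Random(42)
low = [sample_token(logits, 0.2, rng) for _ in range(200)]
high = [sample_token(logits, 0.9, rng) for _ in range(200)]
print("top-token picks at T=0.2:", low.count(0), "of 200")
print("top-token picks at T=0.9:", high.count(0), "of 200")
```

At 0.2 the sampler picks the top token almost every time; at 0.9 the lower-ranked tokens show up regularly, which is the behavior the customer-service and creative-writing examples describe.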

Need to actually use Temperature?

We build production AI systems that put these concepts to work. In 30 minutes, we'll map your use case.