How it works
Top-p 0.9 means: rank all tokens by probability, take the smallest set whose cumulative probability reaches at least 0.9, renormalize the probabilities within that set, and sample from it. This filters out the long tail of unlikely tokens. A common combination is temperature 0.7 with top_p 0.95.
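The procedure can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation: the function names top_p_filter and sample_top_p and the toy token distribution are assumptions made up for the example, and real decoders operate on logits over a full vocabulary rather than a small dictionary.

```python
import random


def top_p_filter(probs, top_p):
    """Return the nucleus of `probs` as {token: renormalized probability}.

    `probs` maps tokens to probabilities that sum to roughly 1.
    (Hypothetical helper for illustration only.)
    """
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")

    # Rank tokens from most to least likely.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    # Keep tokens until their cumulative probability reaches top_p.
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


def sample_top_p(probs, top_p, rng=random):
    """Draw one token from the renormalized nucleus."""
    nucleus = top_p_filter(probs, top_p)
    tokens = list(nucleus)
    weights = list(nucleus.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy distribution (made-up numbers).
toy_probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
print(top_p_filter(toy_probs, 0.9))  # "zebra" falls outside the nucleus
print(sample_top_p(toy_probs, 0.9))
```

With this toy distribution, the first three tokens already cover 0.95 of the probability mass, so "zebra" is never a candidate at top-p 0.9.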
Example
Top-p 0.5 is restrictive: only the most likely tokens are candidates, so the output is focused. Top-p 0.99 is permissive: almost all tokens are candidates, so the output is varied. The exact number of candidates depends on how peaked the distribution is at each step.
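To make the contrast concrete, the sketch below counts how many tokens survive at each setting. The helper name nucleus_size and the seven-token distribution are hypothetical, chosen only to illustrate the effect; a real vocabulary has tens of thousands of entries.

```python
def nucleus_size(probs, top_p):
    """Count the tokens kept when cumulative probability must reach top_p.

    (Hypothetical helper for illustration only.)
    """
    cumulative = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        cumulative += p
        if cumulative >= top_p:
            return count
    # Fallback if rounding keeps the total just under top_p.
    return len(probs)


# Made-up next-token probabilities, most likely first.
toy_probs = [0.4, 0.25, 0.15, 0.1, 0.05, 0.03, 0.02]

restrictive = nucleus_size(toy_probs, 0.5)   # 2 of 7 tokens remain
permissive = nucleus_size(toy_probs, 0.99)   # all 7 tokens remain
print(restrictive, permissive)
```

On a flatter distribution the same top-p values would keep more tokens, and on a sharply peaked one they would keep fewer.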
