How it works
An LLM ingests a sequence of tokens (the prompt), runs them through layers of self-attention and feed-forward networks, and outputs a probability distribution over the next token. A token is sampled from that distribution, appended to the sequence, and the process repeats until generation stops. Modern LLMs use the Transformer architecture, introduced in 2017.
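To make the loop concrete, here is a minimal, self-contained sketch of autoregressive decoding in Python. The toy vocabulary, the `toy_logits` stand-in for a real forward pass, and the `<eos>` stopping token are all illustrative assumptions, not any particular model's internals.

```python
import math
import random

# Minimal sketch of autoregressive decoding. VOCAB, toy_logits, and the
# <eos> stopping token are illustrative stand-ins for a real tokenizer
# and a real Transformer forward pass.
VOCAB = ["Hello", ",", " world", "!", "<eos>"]

def toy_logits(tokens: list[str]) -> list[float]:
    # A real model would run the whole sequence through self-attention
    # and feed-forward layers; this toy just favours whichever token
    # continues a canned greeting.
    nxt = min(len(tokens), len(VOCAB) - 1)
    return [3.0 if i == nxt else 0.1 for i in range(len(VOCAB))]

def softmax(logits: list[float]) -> list[float]:
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate(prompt: list[str], max_new_tokens: int = 10) -> list[str]:
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(toy_logits(tokens))                   # distribution over the next token
        next_token = random.choices(VOCAB, weights=probs)[0]  # sample one token
        if next_token == "<eos>":
            break
        tokens.append(next_token)                             # append and repeat
    return tokens

print("".join(generate([])))
```

Running it usually prints "Hello, world!": the toy logits walk through the greeting one token at a time, exactly the sample-append-repeat cycle described above.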
Example
When you ask Claude a question, your message is tokenised, the model generates its response token by token by sampling from its learned distribution, and any tool calls are emitted as structured outputs that the runtime can execute.
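As a rough illustration of that last point, the sketch below shows how a runtime might dispatch a structured tool call. The JSON shape, the "tool_use" type tag, and the `get_weather` tool are hypothetical assumptions for this example, not Claude's actual wire format.

```python
import json

# Hedged sketch of a runtime dispatching a structured tool call. The
# JSON shape, the "tool_use" tag, and the get_weather tool are all
# hypothetical, not Claude's actual wire format.
model_output = json.dumps({
    "type": "tool_use",
    "name": "get_weather",
    "input": {"city": "Paris"},
})

# The runtime's registry of callable tools (a hypothetical example).
TOOLS = {
    "get_weather": lambda city: f"18 °C and cloudy in {city}",
}

call = json.loads(model_output)
if call["type"] == "tool_use":
    result = TOOLS[call["name"]](**call["input"])
    print(result)  # in practice, fed back to the model as new tokens
```

The key design point is that the model only emits structured text; the runtime is responsible for validating it, executing the matching tool, and returning the result to the model as more input tokens.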
