AI Glossary

Plain-English definitions of 30 terms we use building AI agents, MVPs, and automations. No jargon you don't need.

Agentic AI

Concepts

Software that uses LLMs to plan, decide, and act over multiple steps, often calling tools to interact with other systems. Different from a chatbot, which only responds to messages inside a conversation.

AI Agent

Agents

Software that takes a goal, uses an LLM to plan how to reach it, calls tools as needed, and continues until the goal is met or it gives up. The unit of work in agentic AI.
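That loop can be sketched in a few lines of Python. The LLM and the tool below are stubs for illustration, not any real framework's API:

```python
def stub_llm(goal, history):
    # Stand-in for a real LLM call. It "plans" one tool call,
    # then declares the goal met on the next turn.
    if not history:
        return {"action": "search", "input": goal}
    return {"action": "finish", "input": "done"}

TOOLS = {"search": lambda q: f"results for: {q}"}  # hypothetical tool

def run_agent(goal, max_steps=5):
    history = []
    for _ in range(max_steps):
        step = stub_llm(goal, history)
        if step["action"] == "finish":
            return step["input"]           # goal met
        result = TOOLS[step["action"]](step["input"])
        history.append((step, result))     # feed results back next turn
    return "gave up"                       # step budget exhausted
```

The `max_steps` cap is what stops a confused agent from looping forever.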

Claude

Models

Anthropic's family of LLMs, including Claude Sonnet, Claude Opus, and Claude Haiku. Strong on long-context reasoning, writing quality, and tool use. The model DK Studio uses by default.

Context Window

Concepts

The maximum number of tokens an LLM can see at once. Includes the system prompt, conversation history, and retrieved documents. Frontier models in 2026 range from 200K (Claude) to 2M (Gemini) tokens.

CrewAI

Tools

Open-source multi-agent framework. First-class support for roles, tasks, and processes. Strong when the workflow naturally splits into specialised agents collaborating.

Embedding

Concepts

A numeric vector representation of text (or images, or audio) that captures semantic meaning. If two embeddings are close in vector space, the content they represent is close in meaning. Generated by embedding models.
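"Close in vector space" is usually measured with cosine similarity. A minimal sketch, using toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    # 1.0 = same direction (same meaning), near 0 = unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings, invented for illustration only.
cat = [0.9, 0.1, 0.0]
dog = [0.8, 0.2, 0.1]
car = [0.1, 0.0, 0.9]
```

Here `cosine_similarity(cat, dog)` comes out higher than `cosine_similarity(cat, car)`, which is the whole point: similar meaning, similar vectors.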

Evaluation

Concepts

The practice of measuring whether an LLM output is good. In production, eval sets are curated examples with expected outputs, run on every prompt change to catch regressions.
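A toy version of such a harness, with a stub in place of the real model call:

```python
# Curated examples with expected outputs (illustrative content).
EVAL_SET = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]

def model(prompt):
    # Stub; a real harness would call your LLM with the current prompt.
    return {"2+2": "4", "capital of France": "Paris"}.get(prompt, "")

def run_evals(eval_set):
    # Score = fraction of examples where the output matches exactly.
    passed = sum(1 for case in eval_set if model(case["input"]) == case["expected"])
    return passed / len(eval_set)
```

Real eval suites usually go beyond exact match (LLM-as-judge, fuzzy scoring), but the shape is the same: fixed inputs, expected outputs, one score per run.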

Fine-tuning

Concepts

Continuing to train a base LLM on your own data so it learns your domain or style. Less common in 2026 than in 2023 because frontier models are good enough out of the box for most use cases. Useful for specialised tasks where prompting is not enough.

Related: LLM, Token

Function Calling

Concepts

A specific implementation of tool use where the LLM emits structured JSON describing which function to call and with what arguments. The runtime then executes the function and feeds the result back to the LLM.

Related: Tool Use
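In code, the pattern looks roughly like this. The JSON shape and the `get_weather` tool are illustrative; every provider's exact schema differs:

```python
import json

# Structured JSON the LLM might emit instead of plain text.
llm_output = '{"name": "get_weather", "arguments": {"city": "Lisbon"}}'

def get_weather(city):
    return f"22C and sunny in {city}"  # stub; a real tool would hit an API

TOOLS = {"get_weather": get_weather}

call = json.loads(llm_output)
result = TOOLS[call["name"]](**call["arguments"])
# `result` is then appended to the conversation so the LLM can
# use it to write its final answer.
```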

Gemini

Models

Google's LLM family, including Gemini 1.5 Pro, Gemini 2.0 Flash, and Gemini 2.5. Multimodal-first, deeply integrated with Google Cloud. Strong cost/performance ratio.

Related: LLM, Google

GPT

Models

Generative Pre-trained Transformer. OpenAI's family of LLMs, including GPT-4o and GPT-4.1. The reference architecture for modern decoder-only language models.

Related: LLM, OpenAI, Token

Hallucination

Concepts

When an LLM produces output that sounds confident but is factually wrong. The fundamental challenge of using LLMs for anything with stakes. Mitigated by RAG, citations, and human-in-the-loop review.

Related: RAG, Evaluation

LangChain

Tools

An open-source framework for building LLM applications. Provides chains, agents, retrievers, and integrations. Once dominant, now mostly used narrowly. LangGraph is the modern successor for agents.

LangGraph

Tools

LangChain's stateful agent framework. Builds agents as graphs with explicit state and transitions. The current default for production multi-step agents, with LangSmith integration for observability.

LLM

Models

Large Language Model. A neural network trained to predict the next token, scaled up to billions of parameters. The brain in most AI agents and chatbots in 2026.

Related: GPT, Claude, Gemini

Memory (Long-term)

Concepts

External storage of facts, preferences, or events that persists across sessions. Usually in a vector database or structured store. The agent retrieves relevant memories at the start of each turn.

Memory (Short-term)

Concepts

The conversation history kept inside an LLM's context window during a single session. Resets when the conversation ends. The default kind of memory in most chatbots.

Multi-agent

Concepts

Architecture where multiple specialised AI agents collaborate to handle a workflow. One agent plans, another researches, another writes, another reviews. More complex than single-agent, but powerful when the task splits naturally.

n8n

Tools

Open-source workflow automation tool. Self-hostable, with 400+ integrations. The orchestration layer DK Studio uses for ~70% of automation builds.

Observability

Tools

The ability to see what an LLM or agent did and why. Includes logging prompts, tool calls, costs, and latencies. Critical for debugging non-deterministic systems. LangSmith and Helicone are common tools.

Related: Evaluation

Planning

Concepts

An agent's ability to break a goal into sub-tasks and decide the order. The hardest thing for LLMs to do reliably; the difference between a working agent and one that loops.

Prompt Engineering

Concepts

The practice of crafting prompts to get better outputs from LLMs. Includes structuring instructions, providing examples (few-shot), and chaining prompts. The cheap optimisation before fine-tuning.
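A few-shot prompt just means showing the model worked examples before the real input. A sketch, with an invented sentiment-classification task:

```python
def few_shot_prompt(examples, query):
    # Instructions first, then labelled examples, then the real input.
    lines = ["Classify the sentiment as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Text: {text}\nSentiment: {label}\n")
    lines.append(f"Text: {query}\nSentiment:")
    return "\n".join(lines)

prompt = few_shot_prompt(
    [("I love this product", "positive"), ("Terrible support", "negative")],
    "The onboarding was smooth",
)
# The prompt ends mid-pattern, so the model's natural next token
# is the label you want.
```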

RAG

Concepts

Retrieval-Augmented Generation. Pattern where the LLM doesn't answer from its weights alone but retrieves relevant chunks from a knowledge base first, then generates an answer grounded in those chunks. The default architecture for Q&A on private data.
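Stripped to its skeleton, RAG is retrieve-then-generate. Retrieval below is naive keyword overlap to keep the sketch self-contained; production systems use embeddings and a vector database:

```python
# Hypothetical knowledge base, invented for illustration.
KNOWLEDGE_BASE = [
    "Our refund window is 30 days from purchase.",
    "Support is available Monday to Friday, 9am-5pm.",
]

def retrieve(query, docs, k=1):
    # Rank documents by word overlap with the query (toy retriever).
    overlap = lambda d: len(set(query.lower().split()) & set(d.lower().split()))
    return sorted(docs, key=overlap, reverse=True)[:k]

def build_prompt(query):
    chunks = retrieve(query, KNOWLEDGE_BASE)
    # A real system would now send this prompt to an LLM.
    return f"Answer using only this context:\n{chunks}\n\nQuestion: {query}"
```

The "grounded" part is the instruction to answer only from the retrieved chunks, which is what keeps hallucination down.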

Reasoning

Concepts

The process of an LLM working through a problem step by step before answering. Modern frontier models reason internally; some expose the reasoning trace explicitly (like Claude's thinking blocks or o1's chain-of-thought).

Related: Planning, LLM

System Prompt

Concepts

Instructions given to an LLM before any user input. Defines the model's role, constraints, and tone. Usually invisible to the end user. The most important and most underrated optimisation.
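In chat APIs this is usually just the first message in the list, with a `system` role. The field names below follow the common chat-message shape; exact schemas vary by provider, and the content is invented:

```python
messages = [
    {
        "role": "system",  # set once, before any user input
        "content": "You are a support agent for Acme. Be concise. "
                   "Never discuss pricing.",
    },
    {"role": "user", "content": "How do I reset my password?"},
]
```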

Temperature

Concepts

A sampling parameter that controls randomness in LLM output. Temperature 0 = greedy decoding: the model always picks its most likely next token, so output is near-deterministic. Higher temperatures = more creative, less predictable. Default is usually 0.7.

Related: Top-p
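Mechanically, temperature divides the model's raw scores (logits) before they are turned into probabilities. A minimal sketch with made-up logits:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Low temperature sharpens the distribution toward the top token;
    # high temperature flattens it. Requires temperature > 0.
    scaled = [l / temperature for l in logits]
    m = max(scaled)                            # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)   # top token dominates
hot = softmax_with_temperature(logits, 2.0)    # probabilities flatten out
```

Temperature 0 is the limit of this: the top token gets probability 1, which is why output becomes (near-)repeatable.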

Token

Concepts

The unit of input and output for an LLM. A token is roughly 3-4 characters in English (e.g. "hello" is one token, "hellos" is two). Models charge per input and output token.

Tool Use

Concepts

When an LLM calls external functions or APIs as part of its response. The mechanism that turns an LLM from a text generator into an agent that can act. Also called function calling.

Top-p

Concepts

Another sampling parameter, also called nucleus sampling. Limits the model to picking from the smallest set of tokens whose cumulative probability is at least p. Used alongside temperature to shape output distribution.

Related: Temperature
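The filtering step can be sketched directly. Probabilities here are invented; a real model produces one per token in its vocabulary:

```python
def top_p_filter(probs, p):
    # Keep the highest-probability tokens until their cumulative
    # probability reaches p, then renormalise over the survivors.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        total += prob
        if total >= p:
            break
    return {token: prob / total for token, prob in kept}

probs = {"the": 0.5, "a": 0.3, "xylophone": 0.15, "zebra": 0.05}
nucleus = top_p_filter(probs, 0.8)  # long-tail tokens are cut off
```

The effect: unlikely tokens ("xylophone" above) can never be sampled, while the relative odds of the plausible ones are preserved.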

Vector Database

Tools

Database that stores embeddings (numeric vectors) and supports nearest-neighbour search. Used in RAG to find documents similar in meaning to a query. Examples: Pinecone, Weaviate, pgvector, Qdrant.

Related: Embedding, RAG
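Conceptually, a vector database does this, plus the indexing that keeps it fast over millions of vectors. A brute-force sketch with toy 2-dimensional vectors:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query, index, k=2):
    # Score every stored vector against the query, return the k closest.
    return sorted(index, key=lambda item: cosine(query, item["vector"]), reverse=True)[:k]

# Hypothetical index entries, invented for illustration.
index = [
    {"doc": "refund policy", "vector": [0.9, 0.1]},
    {"doc": "shipping times", "vector": [0.2, 0.8]},
    {"doc": "returns process", "vector": [0.8, 0.2]},
]
results = top_k([1.0, 0.0], index)
```

Real vector databases replace the linear scan with approximate nearest-neighbour indexes (e.g. HNSW) so queries stay fast at scale.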

Term we should add?

Drop us a line. We update this glossary every couple of weeks based on what comes up in real client conversations.

hello@dkstudio.ai