GLOSSARY · Tools

Observability

The ability to see what an LLM or agent did and why. Includes logging prompts, tool calls, costs, and latencies, and is critical for debugging non-deterministic systems. As of April 2026, the common tooling includes Langfuse, LangSmith, and Helicone, plus the OpenTelemetry GenAI semantic conventions for standardising traces.

How it works

Observability tools intercept LLM and tool calls, log them with structured metadata (inputs, outputs, latency, cost, errors), store them for later querying, and alert on anomalies. A production agent logs every prompt, every model response, and every tool call, together with its inputs and outputs.
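The interception step above can be sketched as a plain Python decorator. This is a minimal illustration, not any particular tool's API: `call_llm` is a hypothetical stand-in for a real model call, and `logs` stands in for a trace store.

```python
import json
import time
import uuid


def observe(logs):
    """Decorator: record structured metadata for each wrapped call."""
    def wrap(fn):
        def inner(*args, **kwargs):
            entry = {
                "trace_id": str(uuid.uuid4()),
                "name": fn.__name__,
                "input": {"args": args, "kwargs": kwargs},
                "error": None,
            }
            start = time.perf_counter()
            try:
                result = fn(*args, **kwargs)
                entry["output"] = result
                return result
            except Exception as exc:
                # Failures are logged too, then re-raised.
                entry["error"] = repr(exc)
                raise
            finally:
                entry["latency_ms"] = round((time.perf_counter() - start) * 1000, 2)
                logs.append(entry)
        return inner
    return wrap


logs = []


@observe(logs)
def call_llm(prompt):
    # Stand-in for a real model call; a production wrapper would also
    # record token counts and estimated cost here.
    return f"echo: {prompt}"


call_llm("Why was ticket #123 closed?")
print(json.dumps(logs[0], indent=2, default=str))
```

Real tools do the same thing with more machinery: spans nested into traces, async export to a backend, and sampling, but each log record carries the same core fields shown here.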

Example

When a customer-service agent goes wrong (say, resolves a ticket incorrectly), the on-call engineer pulls the trace in Langfuse: the system prompt, the customer message, the tool calls in order, the tool outputs, and the final response. Debugging takes 5 minutes instead of 5 hours.

Need to actually use Observability?

We build production AI systems that put these concepts to work. In 30 minutes, we map your use case.