Logs

Logs contain all data associated with your AI application, including the inputs to and outputs from your LLM systems. Logs do not include telemetry and metadata surrounding function executions, which are part of spans. Evals are performed on log data, which includes:

User inputs to LLMs and agents
LLM and agent outputs
Documents returned by retrieval systems
Intermediary outputs in chained calls

A log can be part of an experiment, or represent a single execution (for example, in a live monitoring configuration). In the following sections, we describe how to log a single AI execution and obtain evaluation results.
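To make the shape of a log concrete, here is a minimal sketch of the data one log might carry. It is an illustration only: the `LogRecord` class and all of its field names are hypothetical and are not the platform's actual schema or SDK. The fields mirror the kinds of log data listed above, plus an optional experiment identifier to distinguish a log that belongs to an experiment from a single execution.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LogRecord:
    """Hypothetical shape of one log; not the platform's real schema."""

    user_input: str  # user input to the LLM or agent
    output: str  # LLM or agent output
    # Documents returned by a retrieval system, if any.
    retrieved_documents: List[str] = field(default_factory=list)
    # Intermediary outputs from chained calls, in call order.
    intermediate_outputs: List[str] = field(default_factory=list)
    # Set when the log belongs to an experiment; None for a single execution.
    experiment_id: Optional[str] = None

    def is_single_execution(self) -> bool:
        """Return True when the log is not attached to an experiment."""
        return self.experiment_id is None


# A single execution, such as one captured by live monitoring.
live_log = LogRecord(
    user_input="What is the refund window?",
    output="Refunds are accepted within 30 days.",
    retrieved_documents=["Policy: refunds are accepted within 30 days of purchase."],
)

# The same kind of record, attached to a (hypothetical) experiment run.
experiment_log = LogRecord(
    user_input="What is the refund window?",
    output="Refunds are accepted within 30 days.",
    experiment_id="exp-001",
)
```

Keeping retrieved documents and intermediary outputs as separate fields, rather than folding them into the output, is one way to let an evaluator inspect each stage of a chained call on its own.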

See the Experiments section to learn more about how to run batches of logs in experiments to optimize performance over a testing set.

See the Evals section to read more about how evaluators use logs to produce evaluation results, and how to use evaluation results to drive improvements in your AI application.
