Agent Cookbooks
Tracing and Evaluating Agents

Agents Cookbook
Build a customer support agent, then trace its activity, assess its performance, and experiment with prompts and models.

Evaluate an Agent
Trace and evaluate a "talk-to-your-data" agent. Includes evaluations for function-calling accuracy, SQL query generation, code generation, and the agent's execution path.

OpenAI Agents SDK Cookbook
Create an agent with the OpenAI Agents SDK, trace its activity, benchmark with datasets, run experiments, and evaluate traces in production.

Evaluating Agents with Ragas
Create a customer support agent using the OpenAI Agents SDK, trace its interactions, and evaluate performance using Ragas.

Tracing and Evaluating Amazon Bedrock Agents
Build an Amazon Bedrock agent, instrument and trace it with Phoenix, and add evaluations to your agent traces.
Tracing and Evaluating a LangChain OpenAI Agent
Build your own LangChain OpenAI agent using the function-calling API and inspect the agent's internals, all in a minimal setup with conversation and tool use.
Tracing and Evaluating a LlamaIndex OpenAI Agent
Use the function-calling API to create a LlamaIndex OpenAI agent capable of conversation and tool use, and explore its behavior with Phoenix.