Online Evals
Run evaluations on your trace and span data
Evaluations help you understand how your LLM application performs. You can measure it across several dimensions, such as correctness, hallucination, relevance, faithfulness, and latency, so that you can ship LLM applications that are reliable, accurate, and fast.
As your application grows, manually inspecting tens of thousands of rows of traces and spans becomes unwieldy. Online evaluation tasks automatically tag new spans with evaluation labels, helping you find problematic spans and understand performance.
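To make the idea concrete, here is a minimal, library-free sketch of what an online evaluation task does conceptually: it labels spans that have not been evaluated yet, then lets you filter on those labels. It does not use the product's API. The `Span` class, the two evaluators, and the helper names are all hypothetical and exist only for illustration. A real task would typically call an LLM-based or code-based evaluator instead of the simple rules shown here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Span:
    """Hypothetical, simplified stand-in for a span record."""
    span_id: str
    question: str
    answer: str
    latency_ms: float
    eval_labels: Dict[str, str] = field(default_factory=dict)


def latency_eval(span: Span, threshold_ms: float = 2000.0) -> str:
    # Illustrative rule: label a span "slow" above an assumed threshold.
    return "slow" if span.latency_ms > threshold_ms else "fast"


def empty_answer_eval(span: Span) -> str:
    # Illustrative rule: flag spans whose answer is blank.
    return "empty" if not span.answer.strip() else "answered"


# Evaluation name -> evaluator function (both hypothetical).
EVALUATORS: Dict[str, Callable[[Span], str]] = {
    "latency": latency_eval,
    "completeness": empty_answer_eval,
}


def tag_new_spans(
    spans: Iterable[Span],
    evaluators: Dict[str, Callable[[Span], str]],
) -> List[Span]:
    """Attach one label per evaluator to every span that has no labels yet."""
    tagged = []
    for span in spans:
        if span.eval_labels:
            continue  # already evaluated on an earlier run
        for name, evaluate in evaluators.items():
            span.eval_labels[name] = evaluate(span)
        tagged.append(span)
    return tagged


def find_problematic(spans: Iterable[Span], bad_labels: Dict[str, str]) -> List[Span]:
    """Return spans carrying any of the given (evaluation name, label) pairs."""
    return [
        span
        for span in spans
        if any(span.eval_labels.get(name) == label for name, label in bad_labels.items())
    ]
```

Running `tag_new_spans` on each batch of incoming spans and then calling `find_problematic(spans, {"latency": "slow"})` mirrors the workflow described above: labels are attached automatically, and you review only the spans that were flagged.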
Run evaluations in the UI
Run evaluations with code
Read our guide on agent evals