Evals on Traces

Run evaluations on your trace and span data

Evaluations help you understand how your LLM application performs. You can measure it across several dimensions, such as correctness, hallucination, relevance, faithfulness, and latency. These measurements help you ship LLM applications that are reliable, accurate, and fast.

As your application grows, manually inspecting tens of thousands of traces and spans becomes unwieldy. Online evaluation tasks automatically tag new spans with evaluation labels as they arrive, so you can find problematic spans and track performance without reviewing each row by hand.
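
To make the idea of tagging spans with evaluation labels concrete, here is a minimal, self-contained sketch in plain Python. It does not use the platform's API: the `Span` record, the two evaluator functions, the label names, and the 2000 ms latency threshold are all hypothetical, and the empty-output check is only a stand-in for an LLM-judged evaluation such as hallucination or relevance.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Span:
    # Hypothetical minimal span record; real spans carry many more attributes.
    span_id: str
    input: str
    output: str
    latency_ms: float
    eval_labels: Dict[str, str] = field(default_factory=dict)


def label_latency(span: Span, threshold_ms: float = 2000.0) -> str:
    """Label a span 'slow' when it exceeds an assumed latency budget."""
    return "slow" if span.latency_ms > threshold_ms else "fast"


def label_empty_output(span: Span) -> str:
    """Stand-in for an LLM-judged eval: flag spans with no usable output."""
    return "empty" if not span.output.strip() else "non_empty"


# Each evaluator maps a span to a single label (names are illustrative).
EVALUATORS: Dict[str, Callable[[Span], str]] = {
    "latency": label_latency,
    "output_presence": label_empty_output,
}


def tag_spans(spans: List[Span]) -> List[Span]:
    """Attach one label per evaluator to every span, in place."""
    for span in spans:
        for name, evaluator in EVALUATORS.items():
            span.eval_labels[name] = evaluator(span)
    return spans


def problematic(spans: List[Span]) -> List[Span]:
    """Return the spans that any evaluator flagged."""
    bad = {"slow", "empty"}
    return [s for s in spans if bad & set(s.eval_labels.values())]


spans = tag_spans([
    Span("a1", "What is the refund policy?", "Refunds are issued within 30 days.", 850.0),
    Span("b2", "Summarize this ticket.", "", 400.0),
    Span("c3", "Translate to French.", "Bonjour.", 3200.0),
])
flagged_ids = [s.span_id for s in problematic(spans)]
print(flagged_ids)  # ['b2', 'c3']
```

Once every span carries labels, finding the problematic ones is a filter over those labels instead of a manual read-through of each row.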


