How to: Evals
Categorical evaluator (llm_classify)
Numeric evaluator (llm_generate)
Run evaluations as a job and view the results in the UI as traces stream in.
Evaluate traces captured in Phoenix and export results to the Phoenix UI.
Evaluate tasks with multiple inputs or outputs (e.g., text, audio, or image).