Phoenix: AI Observability & Evaluation
Evaluate, troubleshoot, and fine-tune your LLM, CV, and NLP models in a notebook.
Phoenix is an open-source observability library designed for pre-production experimentation, evaluation, and troubleshooting.
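For orientation, Phoenix is distributed on PyPI as `arize-phoenix` and the app can be launched locally from a notebook or script; a minimal sketch:

```python
# pip install arize-phoenix
import phoenix as px

# Launch the Phoenix UI locally; the returned session exposes the URL
# of the running app, which can also be viewed inline in a notebook.
session = px.launch_app()
print(session.url)
```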
Running Phoenix for the first time? Select a quickstart below.
Don't know which one to choose? Phoenix supports two main workflows:
- Evaluate Performance of LLM Tasks with Evals Library: Use the Phoenix Evals library to easily evaluate tasks such as hallucination, summarization, and retrieval relevance, or create your own custom template (see the first sketch after this list).
- Optimize Retrieval Systems: Identify when context is missing from your knowledge base or when irrelevant context is retrieved by visualizing query embeddings alongside knowledge base embeddings with RAG Analysis (see the second sketch after this list).
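To make the first workflow concrete, here is a minimal sketch of a hallucination eval following the Phoenix Evals quickstart. The toy dataframe contents and the model choice are assumptions; the template's expected columns (`input`, `reference`, `output`) and the `llm_classify` call come from the library's documented API, and an `OPENAI_API_KEY` is assumed to be set in the environment:

```python
import pandas as pd
from phoenix.evals import (
    HALLUCINATION_PROMPT_RAILS_MAP,
    HALLUCINATION_PROMPT_TEMPLATE,
    OpenAIModel,
    llm_classify,
)

# Toy example rows: the hallucination template expects "input" (the question),
# "reference" (the retrieved context), and "output" (the model's answer).
df = pd.DataFrame(
    {
        "input": ["Where is the Eiffel Tower?"],
        "reference": ["The Eiffel Tower is a landmark in Paris, France."],
        "output": ["The Eiffel Tower is in Berlin."],
    }
)

model = OpenAIModel(model="gpt-4o-mini")  # model choice is an assumption
rails = list(HALLUCINATION_PROMPT_RAILS_MAP.values())  # e.g. factual / hallucinated

results = llm_classify(
    dataframe=df,
    template=HALLUCINATION_PROMPT_TEMPLATE,
    model=model,
    rails=rails,
    provide_explanation=True,  # adds a rationale column to the output
)
print(results[["label", "explanation"]])
```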
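And for the second workflow, a sketch of launching Phoenix with query embeddings alongside knowledge base embeddings. The dataframes, column names, and toy 3-d vectors are hypothetical (real embeddings would come from the same encoder for both sets), and note that older Phoenix releases call `px.Inferences` by its previous name, `px.Dataset`:

```python
import pandas as pd
import phoenix as px

# Hypothetical data: user queries and knowledge-base documents,
# each row carrying a precomputed embedding vector.
query_df = pd.DataFrame(
    {
        "query": ["How do I reset my password?"],
        "embedding": [[0.12, -0.03, 0.88]],  # toy 3-d vector for illustration
    }
)
corpus_df = pd.DataFrame(
    {
        "id": [0],
        "text": ["To reset your password, open Settings > Account."],
        "embedding": [[0.10, -0.01, 0.90]],
    }
)

query_schema = px.Schema(
    prompt_column_names=px.EmbeddingColumnNames(
        vector_column_name="embedding",
        raw_data_column_name="query",
    ),
)
corpus_schema = px.Schema(
    id_column_name="id",
    document_column_names=px.EmbeddingColumnNames(
        vector_column_name="embedding",
        raw_data_column_name="text",
    ),
)

# Queries are the primary inferences; the knowledge base is the corpus.
# The UI overlays both embedding sets in a shared projection, making
# missing or irrelevant context visible as gaps and outliers.
session = px.launch_app(
    primary=px.Inferences(dataframe=query_df, schema=query_schema, name="queries"),
    corpus=px.Inferences(dataframe=corpus_df, schema=corpus_schema, name="knowledge_base"),
)
```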
Check out a comprehensive list of example notebooks for LLM Traces, Evals, RAG Analysis, and more.
Learn about best practices and how to get started with use case examples such as Q&A with Retrieval, Summarization, and Chatbots.
Join the Phoenix Slack community to ask questions, share findings, provide feedback, and connect with other developers.