Release Notes
The latest releases from the Phoenix team.
Available in Phoenix 8.6+
We've introduced support for the OpenAI Agents SDK for Python, which provides enhanced visibility into agent behavior and performance.
This includes an OpenTelemetry Instrumentor that natively traces agents, LLM calls, tool usage, and agent handoffs. With minimal setup, you can enable tracing and gain real-time insights into your agents' workflows.
Installation
To enable tracing, use the register function to connect your application to Phoenix. Once set up, run your agents and view traces directly in Phoenix for real-time insights.
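Here's a rough sketch of the setup, assuming the openinference-instrumentation-openai-agents package is installed (the project name below is a placeholder):

```python
# Trace the OpenAI Agents SDK, assuming openinference-instrumentation-openai-agents
# is installed; the project name is a placeholder
from phoenix.otel import register
from openinference.instrumentation.openai_agents import OpenAIAgentsInstrumentor

# Connect this application to a running Phoenix instance
tracer_provider = register(project_name="agents-demo")

# Agent runs, LLM calls, tool usage, and handoffs are captured as spans
OpenAIAgentsInstrumentor().instrument(tracer_provider=tracer_provider)
```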
For more details on a quick setup, check out our docs!
Available in Phoenix 8.6+
We’ve introduced several enhancements to Projects, providing greater flexibility and control over how you interact with data. These updates include:
Persistent Column Selection on Tables: Your selected columns will now remain consistent across sessions, ensuring a more seamless workflow.
Metadata Filters from the Table: Easily filter data directly from the table view using metadata attributes.
Custom Time Ranges: You can now specify custom time ranges to filter traces and spans.
Root Span Filter for Spans: Improved filtering options allow you to filter by root spans, helping to isolate and debug issues more effectively.
Metadata Quick Filters: Quickly apply common metadata filters for faster navigation.
Improvements & Bug Fixes 🐛
Prompt Playground: Now supports GPT-4.5, Anthropic Claude 3.7 Sonnet, and Thinking Budgets
Instrumentation: SmolagentsInstrumentor to trace smolagents by Hugging Face
Experiments: Enhanced experiment filtering for better analysis
Performance: Major speed improvements in project tracing views & visibility into database usage in settings
Evals: o3 support, Audio & Multi-Modal Evaluations
Instrumentation: Tool call IDs in LlamaIndex & deprecation of LangChain v0.1
Integrations: Phoenix now supports LiteLLM Proxy & Cleanlabs evals
OTEL: Auto-instrument tag & decorators for streamlined observability
Check out these docs for more!
Available in Phoenix 8.0+
Phoenix prompt management now lets you create, modify, tag, and version control prompts for your applications! Here are some highlights from this release:
Versioning & Iteration: Seamlessly manage prompt versions in both Phoenix and your codebase.
New TypeScript Client: Sync prompts with your JavaScript runtime, now with native support for OpenAI, Anthropic, and the Vercel AI SDK.
New Python Client: Sync templates and apply them to AI SDKs like OpenAI, Anthropic, and more (see the sketch after this list).
Standardized Prompt Handling: Native normalization for OpenAI, Anthropic, Azure OpenAI, and Google AI Studio.
Enhanced Metadata Propagation: Track prompt metadata on Playground spans and experiment metadata in dataset runs.
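Here's a rough sketch of the Python client flow; the prompt identifier and template variable are hypothetical, and assume a prompt has already been saved in Phoenix:

```python
# Pull a versioned prompt from Phoenix and use it with the OpenAI SDK;
# "article-summarizer" and the "article" variable are hypothetical
from openai import OpenAI
from phoenix.client import Client

prompt = Client().prompts.get(prompt_identifier="article-summarizer")

# format() fills in template variables and returns provider-ready kwargs
resp = OpenAI().chat.completions.create(
    **prompt.format(variables={"article": "Phoenix 8.0 adds prompt management."})
)
print(resp.choices[0].message.content)
```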
Check out the docs and this walkthrough for more on prompts!📝
Available in Phoenix 8.0+
Phoenix has made it even simpler to get started with tracing by introducing one-line auto-instrumentation. By using register(auto_instrument=True), you can enable automatic instrumentation in your application, which will set up instrumentors based on your installed packages.
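For example, a minimal sketch (the project name is a placeholder):

```python
# Connect to Phoenix and turn on every OpenInference instrumentor that
# matches a package installed in this environment
from phoenix.otel import register

tracer_provider = register(
    project_name="my-app",   # placeholder project name
    auto_instrument=True,    # instrument based on installed packages
)
```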
For more details, you can check the docs and explore further tracing options.
Available in Phoenix 7.9+
In addition to using our automatic instrumentors and tracing directly using OTEL, we've now added our own layer to let you have the granularity of manual instrumentation without as much boilerplate code.
You can now access a tracer object with streamlined options to trace functions and code blocks. The two main options are:
Using the @tracer.chain decorator traces the entire function automatically as a span in Phoenix. The input, output, and status attributes are set based on the function's parameters and return value.
Using the tracer in a with clause allows you to trace specific code blocks within a function. You manually define the span name, input, output, and status.
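Here's a compact sketch of both options; the function bodies, names, and span fields are illustrative:

```python
from opentelemetry.trace import Status, StatusCode
from phoenix.otel import register

tracer_provider = register(project_name="manual-tracing-demo")  # placeholder name
tracer = tracer_provider.get_tracer(__name__)

# Option 1: decorate a function; its input, output, and status are captured automatically
@tracer.chain
def summarize(text: str) -> str:
    return text[:50]  # placeholder logic

# Option 2: trace a specific code block and set the span fields yourself
def pipeline(text: str) -> str:
    with tracer.start_as_current_span(
        "summarize-block", openinference_span_kind="chain"
    ) as span:
        span.set_input(text)
        result = summarize(text)
        span.set_output(result)
        span.set_status(Status(StatusCode.OK))
        return result
```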
Check out the docs for more on how to use tracer objects.
Available in Phoenix 7.0+
Sessions allow you to group multiple responses into a single thread. Each response is still captured as a single trace, but each trace is linked together and presented in a combined view.
Sessions make it easier to visualize multi-turn exchanges with your chatbot or agent. Sessions launches with Python and TS/JS support. For more on sessions, check out a walkthrough video and the docs.
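Here's a rough sketch of grouping traces into a session with the OpenInference using_session helper; the session ID and chatbot call are placeholders:

```python
# Tag spans with a session ID so related turns are grouped in Phoenix;
# assumes the openinference-instrumentation package is installed
from openinference.instrumentation import using_session

def run_chatbot(message: str) -> str:
    # Stand-in for a real chatbot or agent invocation
    return f"echo: {message}"

def handle_turn(user_message: str, session_id: str) -> str:
    # Spans created inside this block carry the session.id attribute,
    # so every turn of the conversation appears in one combined view
    with using_session(session_id=session_id):
        return run_chatbot(user_message)
```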
Available in Phoenix 6.0+
Prompt Playground is now available in the Phoenix platform! This new release allows you to test the effects of different prompts, tools, and structured output formats to see which performs best.
Replay individual spans with modified prompts, or run full Datasets through your variations.
Easily test different models, prompts, tools, and output formats side-by-side, directly in the platform.
Automatically capture traces as Experiment runs for later debugging. See here for more information on Prompt Playground, or jump into the platform to try it out for yourself.
We've made several performance enhancements, added new features, and fixed key issues to improve stability, usability, and efficiency across Phoenix.
Numerous stability improvements to our hosted Phoenix instances accessed on app.phoenix.arize.com
Added a new command to easily launch a Phoenix instance from the CLI: phoenix serve
Implemented a simple email sender to simplify dependencies
Improved error handling for imported spans
Replaced hdbscan with fast-hdbscan
Added PHOENIX_CSRF_TRUSTED_ORIGINS environment variable to set trusted origins
Added support for Mistral 1.0
Fixed an issue that caused px.Client().get_spans_dataframe() requests to time out
Available in Phoenix 5.0+
We've added Authentication and Role-Based Access Controls to Phoenix. This was a long-requested feature set, and we're excited for the new uses of Phoenix this will unlock!
The auth feature set includes:
🔒 Secure Access: All of Phoenix’s UI & APIs (REST, GraphQL, gRPC) now require access tokens or API keys. Keep your data safe!
👥 RBAC (Role-Based Access Control): Admins can manage users; members can update their profiles—simple & secure.
🔑 API Keys: Now available for seamless, secure data ingestion & querying (see the sketch below).
🌐 OAuth2 Support: Easily integrate with Google, AWS Cognito, or Auth0.
✉ Password Resets via SMTP to make security a breeze.
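Here's a sketch of authenticated trace ingestion using an API key created in the Phoenix UI; the key and endpoint values are placeholders:

```python
# Authenticate trace ingestion with a Phoenix API key; the key, endpoint,
# and project name below are placeholders
import os
from phoenix.otel import register

os.environ["PHOENIX_API_KEY"] = "your-api-key"
os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = "https://your-phoenix-host"

tracer_provider = register(project_name="secured-app")
```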
For all the details on authentication, view our docs.
Available in Phoenix 4.11.0+
Our integration with Guardrails AI allows you to capture traces on guard usage and create datasets based on these traces. This integration is designed to enhance the safety and reliability of your LLM applications, ensuring they adhere to predefined rules and guidelines.
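Here's a minimal sketch of enabling the integration, assuming the openinference-instrumentation-guardrails package is installed (the project name is a placeholder):

```python
# Trace guard usage in a Guardrails AI application
from phoenix.otel import register
from openinference.instrumentation.guardrails import GuardrailsInstrumentor

tracer_provider = register(project_name="guardrails-demo")  # placeholder name
GuardrailsInstrumentor().instrument(tracer_provider=tracer_provider)
# Guard calls made after this point show up as traces in Phoenix
```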
Check out the Cookbook here.
Phoenix is now available for deployment as a fully hosted service.
In addition to our existing notebook, CLI, and self-hosted deployment options, we’re excited to announce that Phoenix is now available as a fully hosted service.
With hosted instances, your data is stored between sessions, and you can easily share your work with team members.
We are partnering with LlamaIndex to power a new observability platform in LlamaCloud: LlamaTrace. LlamaTrace will automatically capture traces emitted from your LlamaIndex applications, and store them in a persistent, cloud-accessible Phoenix instance.
Hosted Phoenix is 100% free to use. Check it out today!
Available in Phoenix 4.6+
Datasets 📊: Datasets are a new core feature in Phoenix that live alongside your projects. They can be imported, exported, created, curated, manipulated, and viewed within the platform, and should make a few flows much easier:
Fine-tuning: You can now create a dataset based on conditions in the UI, or by manually choosing examples, then export these into CSV or JSONL formats ready-made for fine-tuning APIs.
Experimentation: External datasets can be uploaded into Phoenix to serve as the test cases for experiments run in the platform.
For more details on using datasets see our documentation or example notebook.
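Here's a rough sketch of creating a dataset from a dataframe with the Python client; the dataset name and column keys are illustrative:

```python
# Create a small dataset from a dataframe; "qa-examples" and the columns
# are illustrative
import pandas as pd
import phoenix as px

df = pd.DataFrame(
    {
        "question": ["What is Phoenix?", "What are datasets for?"],
        "answer": [
            "An open-source LLM observability platform.",
            "Curating examples for fine-tuning and experiments.",
        ],
    }
)

dataset = px.Client().upload_dataset(
    dataset_name="qa-examples",   # placeholder dataset name
    dataframe=df,
    input_keys=["question"],      # columns treated as inputs
    output_keys=["answer"],       # columns treated as expected outputs
)
```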
Experiments 🧪: Our new Datasets and Experiments feature enables you to create and manage datasets for rigorous testing and evaluation of your models. You can now run comprehensive experiments to measure and analyze the performance of your LLMs in various scenarios.
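Here's a rough sketch of running an experiment against the dataset sketched above; the task and evaluator are toy stand-ins for a real application:

```python
# Run an experiment over a dataset; the task and evaluator are stand-ins
import phoenix as px
from phoenix.experiments import run_experiment

dataset = px.Client().get_dataset(name="qa-examples")  # placeholder dataset name

def task(example):
    # Call your LLM app with the example's input; this echo is a stand-in
    return f"Answer to: {example.input['question']}"

def contains_answer(output, expected) -> bool:
    # Toy evaluator: does the task output mention the expected answer?
    return expected["answer"].lower() in output.lower()

run_experiment(
    dataset,
    task,
    evaluators=[contains_answer],
    experiment_name="baseline",  # placeholder experiment name
)
```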
For more details, check out our full walkthrough.
Available in Phoenix 4.6+
We are introducing a new built-in function call evaluator that scores the function/tool-calling capabilities of your LLMs. This off-the-shelf evaluator will help you ensure that your models are not just generating text but also effectively interacting with tools and functions as intended.
This evaluator checks for issues arising from function routing, parameter extraction, and function generation.
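Here's a rough sketch of running the evaluator with llm_classify; the dataframe columns are assumed to match the template's variables, and the example row is illustrative:

```python
# Score tool calls with the built-in tool calling template; the column names
# and the single example row are assumptions for illustration
import pandas as pd
from phoenix.evals import (
    TOOL_CALLING_PROMPT_RAILS_MAP,
    TOOL_CALLING_PROMPT_TEMPLATE,
    OpenAIModel,
    llm_classify,
)

df = pd.DataFrame(
    {
        "question": ["What's the weather in Paris?"],
        "tool_call": ['get_weather(city="Paris")'],
        "tool_definitions": ["get_weather(city: str) -> str: current weather for a city"],
    }
)

rails = list(TOOL_CALLING_PROMPT_RAILS_MAP.values())
results = llm_classify(
    df,
    template=TOOL_CALLING_PROMPT_TEMPLATE,
    model=OpenAIModel(model="gpt-4o"),  # any supported eval model
    rails=rails,
    provide_explanation=True,
)
```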
Check out a full walkthrough of the evaluator.