TypeScript API Python API Community GitHub Phoenix Cloud

Architecture

Learn how Phoenix fits into your ML stack and how to incorporate Phoenix into your workflows.

PreviousEmbeddings NextOpenInference

Last updated 1 year ago

Was this helpful?

#357: Update Phoenix Inferences Quickstart

Change request updated 1 year ago

Architecture

Learn how Phoenix fits into your ML stack and how to incorporate Phoenix into your workflows.

Phoenix is designed to run locally on a single server in conjunction with the Notebook.

Phoenix runs locally, close to your data, in an environment that interfaces to Notebook cells on the Notebook server. Designing Phoenix to run locally, enables fast iteration on top of local data.

How should I use Phoenix?

In order to use Phoenix:

Load data into pandas dataframe
Start Phoenix
1. Single dataframe
Investigate problems
(Optional) Export data

Load Data Into pandas:

Leverage SDK Embeddings and LLM Eval Generators:

Start Phoenix with DataFrames:

Phoenix is typically started in a notebook from which a local Phoenix server is kicked off. Two approaches can be taken to the overall use of Phoenix:

Single Dataset

In the case of a team that only wants to investigate a single dataset for exploratory data analysis (EDA), a single dataset instantiation of Phoenix can be used. In this scenario, a team is normally analyzing the data in an exploratory manner and is not doing A/B comparisons.

Two Datasets

A common use case in ML is for teams to have 2x datasets they are comparing such as: training vs production, model A vs model B, OR production time X vs production time Y, just to name a few. In this scenario there exists a primary and reference dataset. When using the primary and reference dataset, Phoenix supports drift analysis, embedding drift and many different A/B dataset comparisons.

Investigate Problems:

Once instantiated, teams can dive into Phoenix on a feature by feature basis, analyzing performance and tracking down issues.

Export Cluster:

Once an issue is found, the cluster can be exported back into a dataframe for further analysis. Clusters can be used to create groups of similar data points for use downstream, these include:

Finding Similar Examples
Monitoring
Steering Vectors / Steering Prompts

How Phoenix fits into the ML Stack

The above picture shows the use of Phoenix with a cloud observability system (this is not required). In this example the cloud observability system allows the easy download (or synchronization) of data to the Notebook typically based on model, batch, environment, and time ranges. Normally this download is done to analyze data at the tail end of troubleshooting workflow, or periodically to use the notebook environment to monitor your models.

Once in a notebook environment the downloaded data can power Observability workflows that are highly interactive. Phoenix can be used to find clusters of data problems and export those clusters back to the Observability platform for use in monitoring and active learning workflows.

Note: Data can also be downloaded from any data warehouse system for use in Phoenix without the requirement of a cloud ML observability solution.

In the first version of Phoenix it is assumed the data is available locally but we’ve also designed it with some broader visions in mind. For example, Phoenix was designed with a stateless metrics engine as a first class citizen, enabling any metrics checks to be run in any python data pipeline.

PreviousEmbeddings NextOpenInference

Last updated 1 year ago

Was this helpful?

Phoenix is designed to run locally on a single server in conjunction with the Notebook.

Phoenix runs locally, close to your data, in an environment that interfaces to Notebook cells on the Notebook server. Designing Phoenix to run locally, enables fast iteration on top of local data.

How should I use Phoenix?

In order to use Phoenix:

Load data into pandas dataframe
(Optional) Leverage embeddings and LLM eval generators
Start Phoenix
1. Single dataframe
2. (Optional) Two dataframes: primary and
Investigate problems
(Optional) Export data

Load Data Into pandas:

Phoenix currently requires pandas dataframes which can be downloaded from either an ML observability platform, a table or a raw log file. The data is assumed to be formatted in the format with a well defined column structure, normally including a set of inputs/features, outputs/predictions and ground truth.

Leverage SDK Embeddings and LLM Eval Generators:

The Phoenix library heavily uses as a method for data visualization and debugging. In order to use Phoenix with embeddings they can either be generated using an SDK call or they can be supplied by the user of the library. Phoenix supports embeddings for LLMs, Image, NLP, and tabular datasets.

Start Phoenix with DataFrames:

Phoenix is typically started in a notebook from which a local Phoenix server is kicked off. Two approaches can be taken to the overall use of Phoenix:

Single Dataset

Two Datasets

Investigate Problems:

Once instantiated, teams can dive into Phoenix on a feature by feature basis, analyzing performance and tracking down issues.

Export Cluster:

Once an issue is found, the cluster can be exported back into a dataframe for further analysis. Clusters can be used to create groups of similar data points for use downstream, these include:

Finding Similar Examples
Monitoring
Steering Vectors / Steering Prompts

How Phoenix fits into the ML Stack

Phoenix is designed to monitor, analyze and troubleshoot issues on top of your model data allowing for workflows all within a Notebook environment.

Note: Data can also be downloaded from any data warehouse system for use in Phoenix without the requirement of a cloud ML observability solution.