Export Your Data
How to export your data for labeling, evaluation, or fine-tuning
Phoenix is designed as a pre-production tool for finding interesting or problematic data, which you can then export for use cases such as:
A subset of production data for re-labeling and training
A subset of data for fine-tuning an LLM
Exporting Traces
The easiest way to gather traces that have been collected by Phoenix is to pull a dataframe of the traces directly from your Phoenix session object.
Notice that the get_spans_dataframe method accepts a Python expression as an optional str parameter, so you can filter your data down to the specific traces you care about. For full details, consult the Session API docs.
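As a minimal sketch of the call described above, the helper below wraps get_spans_dataframe so the optional filter can be passed or omitted. The helper name export_spans, the px.active_session() accessor shown in the usage comment, and the example filter string are assumptions made for illustration and do not come from this page; check the Session API docs for the exact signature in your Phoenix version.

```python
from typing import Any, Optional


def export_spans(session: Any, filter_condition: Optional[str] = None) -> Any:
    """Pull the spans dataframe from a Phoenix session object.

    `session` is assumed to expose `get_spans_dataframe`, as described above.
    `filter_condition` is the optional Python-expression string used to
    narrow the result to the traces of interest.
    """
    if filter_condition is None:
        # No filter given: return every span the session has collected.
        return session.get_spans_dataframe()
    # Filter given: pass the expression through unchanged as a str.
    return session.get_spans_dataframe(filter_condition)


# Hypothetical notebook usage (assumes `arize-phoenix` is installed and a
# session is running; the filter expression is illustrative only):
#   import phoenix as px
#   retriever_spans = export_spans(px.active_session(), 'span_kind == "RETRIEVER"')
```

Keeping the filter as a plain string means the same helper works whether you want everything or only a narrow slice for re-labeling.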
You can also get the spans directly from the tracer or callback. The get_spans method is available on the LangChain tracer, and the exact same method exists on the OpenInferenceCallback for LlamaIndex.
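Because both objects share the same method, one small helper can serve either integration. The sketch below is illustrative only: collect_spans is a hypothetical name, and this page does not state whether get_spans returns a list or an iterator, so the helper wraps the result in list() to cover both cases.

```python
from typing import Any, List


def collect_spans(source: Any) -> List[Any]:
    """Materialize the spans recorded by a tracer or callback handler.

    `source` is assumed to be any object exposing `get_spans()`, such as the
    LangChain tracer or the OpenInferenceCallback for LlamaIndex.
    """
    # list() accepts either a list or an iterator, so the caller always
    # receives a concrete list that can be inspected more than once.
    return list(source.get_spans())
```

Call it after your application has run with the tracer or callback attached, so that there are spans to collect.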
Exporting Embeddings
Embeddings can be extremely useful for fine-tuning. There are two ways to export your embeddings from the Phoenix UI.
Export Selected Clusters
To export a cluster (selected either with the lasso tool or from the cluster list in the right-hand panel), click the export button at the top left of the bottom slide-out.
Export All Clusters
To export all clusters of embeddings as a single dataframe (labeled by cluster), click the ... icon at the top right of the screen and click export. Your data will be available either as a Parquet file or back in your notebook, via your session, as a dataframe.