Export Your Data
How to export your data for labeling, evaluation, or fine-tuning
Phoenix is designed as a pre-production tool for finding interesting or problematic data, which you can then export for use cases such as labeling, evaluation, or fine-tuning.
The easiest way to gather traces that have been collected by Phoenix is to pull a dataframe of the traces directly from your Phoenix session:
import phoenix as px
px.active_session().get_spans_dataframe('span_kind == "RETRIEVER"')
Notice that the get_spans_dataframe method supports a Python expression as an optional str parameter so you can filter down your data to the specific traces you care about. For full details, consult the Session API docs.
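Once you have a spans dataframe, a common next step is to write it to disk for a labeling or fine-tuning pipeline. The following is a minimal sketch, assuming the result is an ordinary pandas DataFrame. The helper export_spans_to_jsonl and the stand-in columns are hypothetical, and the real column names returned by Phoenix may differ.

```python
import pandas as pd


def export_spans_to_jsonl(spans_df: pd.DataFrame, path: str) -> int:
    """Write one JSON object per span to `path` and return the number of rows written.

    Hypothetical helper for illustration; it is not part of the Phoenix API.
    """
    spans_df.to_json(path, orient="records", lines=True)
    return len(spans_df)


# Stand-in for the dataframe returned by get_spans_dataframe(...).
# These columns are assumptions chosen for the example.
example_spans = pd.DataFrame(
    {
        "name": ["retrieve", "retrieve"],
        "span_kind": ["RETRIEVER", "RETRIEVER"],
    }
)
```

JSON Lines is used here only because many labeling and fine-tuning tools accept it. The same dataframe could be written with any other pandas writer.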
You can also directly get the spans from the tracer or callback:
import phoenix as px
from phoenix.trace.langchain import OpenInferenceTracer
from phoenix.trace.trace_dataset import TraceDataset
tracer = OpenInferenceTracer()
# Run the application with the tracer
# When you are ready to analyze the data, you can convert the traces
ds = TraceDataset.from_spans(tracer.get_spans())
# Print the dataframe
print(ds.dataframe.head())
# Re-initialize the app with the trace dataset
px.launch_app(trace=ds)
Note that the above calls get_spans on a LangChain tracer, but the exact same method exists on the OpenInferenceCallback for LlamaIndex as well.
Embeddings can be extremely useful for fine-tuning. There are two ways to export your embeddings from the Phoenix UI.
To export a cluster (selected either via the lasso tool or via the cluster list on the right-hand panel), click the export button on the top left of the bottom slide-out.
To export all clusters of embeddings as a single dataframe (labeled by cluster), click the ... icon on the top right of the screen and click export. Your data will be available either as a Parquet file or back in your notebook, via your session, as a dataframe.
session = px.active_session()
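With an exported dataframe back in your notebook, you may want to work with one cluster at a time. The following is a minimal sketch, assuming the export is a pandas DataFrame with one column holding the cluster label. The column name cluster_id and the helper split_by_cluster are hypothetical, so check the columns of your own export before relying on them.

```python
import pandas as pd


def split_by_cluster(exported_df: pd.DataFrame, cluster_column: str = "cluster_id") -> dict:
    """Return a dict mapping each cluster label to the rows belonging to that cluster.

    Hypothetical helper; `cluster_column` is an assumed name, not a documented one.
    """
    return {
        label: group.reset_index(drop=True)
        for label, group in exported_df.groupby(cluster_column)
    }


# Stand-in for an exported dataframe; the real export will have different columns.
example_export = pd.DataFrame(
    {
        "text": ["a", "b", "c"],
        "cluster_id": ["c0", "c1", "c0"],
    }
)
clusters = split_by_cluster(example_export)
```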