Export Your Data
How to export your data for labeling, evaluation, or fine-tuning
Phoenix is designed to be a pre-production tool that can be used to find interesting or problematic data that can be used for various use-cases:
The easiest way to gather traces that have been collected by Phoenix is to directly pull a dataframe of the traces from your Phoenix
session
object.px.active_session().get_spans_dataframe('span_kind == "RETRIEVER"')
Notice that the
get_spans_dataframe
method supports a Python expression as an optional str
parameter so you can filter down your data to specific traces you care about. For full details, consult the Session API docs.You can also directly get the spans from the tracer or callback:
from phoenix.trace.langchain import OpenInferenceTracer
tracer = OpenInferenceTracer()
# Run the application with the tracer
chain.run(query, callbacks=[tracer])
# When you are ready to analyze the data, you can convert the traces
ds = TraceDataset.from_spans(tracer.get_spans())
# Print the dataframe
ds.dataframe.head()
# Re-initialize the app with the trace dataset
px.launch_app(trace=ds)
Note that the above calls
get_spans
on a LangChain tracer but the same exact method exists on the OpenInferenceCallback
for LlamaIndex as well.Embeddings can be extremely useful for fine-tuning. There are two ways to export your embeddings from the Phoenix UI.
To export a cluster (either selected via the lasso tool or via a the cluster list on the right hand panel), click on the export button on the top left of the bottom slide-out.
To export all clusters of embeddings as a single dataframe (labeled by cluster), click the
...
icon on the top right of the screen and click export. Your data will be available either as a Parquet file or is available back in your notebook via your session as a dataframe.session = px.active_session()
session.exports[-1].dataframe
Last modified 1mo ago