Querying Spans
Span queries help you extract data from your traces into DataFrames for evaluation.
Connect to Phoenix
Before accessing px.Client(), be sure you've set the following environment variables:
If you're self-hosting Phoenix, ignore the client headers and change the collector endpoint to your endpoint.
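As a minimal sketch, the two variables can be set from Python before the client is created. The API key value is a placeholder, and the hosted endpoint URL is an assumption to replace with your own endpoint when self-hosting.

```python
import os

# Placeholder API key: substitute your own. Skip this line when self-hosting.
os.environ["PHOENIX_CLIENT_HEADERS"] = "api_key=<your-api-key>"

# Assumed hosted endpoint: point this at your own server when self-hosting.
os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = "https://app.phoenix.arize.com"
```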
How to Run a Query
You can query for data from the traces collected in Phoenix using the Client. To get a DataFrame of spans, ask the client for one directly. Each row of the DataFrame will be a span that matches the filter criteria and time range passed in. If you leave the parameters blank, you will get all the spans.
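A minimal sketch of asking the client for a DataFrame, assuming the arize-phoenix package is installed and a Phoenix server is reachable. The function name fetch_spans is illustrative, not part of Phoenix.

```python
# Example filter: keep only spans produced by LLM calls.
LLM_SPANS = "span_kind == 'LLM'"


def fetch_spans(filter_condition=None):
    """Return a DataFrame with one row per span matching the filter."""
    import phoenix as px  # requires the arize-phoenix package

    # With filter_condition=None no filter is applied.
    return px.Client().get_spans_dataframe(filter_condition)
```

Calling fetch_spans() applies no filter, while fetch_spans(LLM_SPANS) restricts the result to LLM spans.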
You can also query for data using our query DSL (domain-specific language). For example, a query can pull all retriever spans and select their input value; the output of such a query is a DataFrame that contains the input values for all retriever spans.
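The retriever query described above might be written as follows. This is a sketch that assumes SpanQuery is importable from phoenix.trace.dsl in your installed version; the function name is illustrative.

```python
def retriever_inputs_query():
    """Build a query selecting the input value of every retriever span."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return SpanQuery().where("span_kind == 'RETRIEVER'").select(input="input.value")
```

The query is executed by passing it to the client, e.g. px.Client().query_spans(retriever_inputs_query()).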
DataFrame Index
By default, the result DataFrame is indexed by span_id, and if .explode() is used, the index from the exploded list is added to create a multi-index on the result DataFrame. For the special retrieval.documents span attribute, the added index is renamed as document_position.
How to Specify a Time Range
By default, all queries will collect all spans that are in your Phoenix instance. If you'd like to focus on the most recent spans, you can pull spans based on time frames using start_time and end_time.
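A sketch of a time-bounded request, assuming the arize-phoenix package is installed. The seven-day window and the function name are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta

now = datetime.now()
start_time = now - timedelta(days=7)  # one week ago
end_time = now - timedelta(days=1)    # one day ago


def fetch_spans_in_window():
    """Return only the spans that fall inside the window above."""
    import phoenix as px  # requires the arize-phoenix package

    return px.Client().get_spans_dataframe(start_time=start_time, end_time=end_time)
```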
How to Specify a Project
By default, all queries are executed against the default project or the project set via the PHOENIX_PROJECT_NAME environment variable. If you choose to pull from a different project, all methods on the Client have an optional parameter named project_name.
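For instance, a project can be passed explicitly. This is a sketch in which "my-project" is a hypothetical project name and the function name is illustrative.

```python
PROJECT_NAME = "my-project"  # hypothetical project name


def fetch_project_spans(project_name=PROJECT_NAME):
    """Return the spans of one specific project."""
    import phoenix as px  # requires the arize-phoenix package

    return px.Client().get_spans_dataframe(project_name=project_name)
```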
Querying for Retrieved Documents
Let's say we want to extract the retrieved documents into a DataFrame that looks something like the table below, where input denotes the query for the retriever, reference denotes the content of each document, and document_position denotes the (zero-based) index in each span's list of retrieved documents.
Note that this DataFrame can be used directly as input for the Retrieval (RAG) Relevance evaluations.
span_id | document_position | input | reference
5B8EF798A381 | 0 | What was the author's motivation for writing ... | In fact, I decided to write a book about ...
5B8EF798A381 | 1 | What was the author's motivation for writing ... | I started writing essays again, and wrote a bunch of ...
... | ... | ... | ...
E19B7EC3GG02 | 0 | What did the author learn about ... | The good part was that I got paid huge amounts of ...
We can accomplish this with a simple query as follows. Also see Predefined Queries for a helper function executing this query.
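One way to write that query, sketched under the assumption that SpanQuery is importable from phoenix.trace.dsl. The function name is illustrative.

```python
def retrieved_documents_query():
    """Build a query yielding one row per retrieved document."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return (
        SpanQuery()
        .where("span_kind == 'RETRIEVER'")   # retriever spans only
        .select(input="input.value")         # the retriever's query text
        .explode("retrieval.documents", reference="document.content")
    )
```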
How to Explode Attributes
In addition to the document content, if we also want to explode the document score, we can add the document.score attribute to the .explode() method alongside document.content. Keyword arguments are necessary to name the output columns, and in this example we name the output columns reference and score. (Python's double-asterisk unpacking idiom can be used to specify arbitrary output names containing spaces or symbols. See Arbitrary Output Column Names for an example.)
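A sketch of exploding both attributes at once, again assuming SpanQuery from phoenix.trace.dsl. The function name is illustrative.

```python
def documents_with_scores_query():
    """Build a query exploding each document's content and score."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return (
        SpanQuery()
        .where("span_kind == 'RETRIEVER'")
        .select(input="input.value")
        .explode(
            "retrieval.documents",
            reference="document.content",  # output column "reference"
            score="document.score",        # output column "score"
        )
    )
```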
How to Apply Filters
The .where() method accepts a string containing a valid Python boolean expression. The expression can be arbitrarily complex, but restrictions apply, e.g. function calls are generally disallowed. A conjunction can, for example, also filter on whether the input value contains the string 'programming'.
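The conjunction described above might look like this. The ast.parse call only checks that the string is syntactically a Python expression, which is a necessary condition; Phoenix applies its own further restrictions. The function name is illustrative.

```python
import ast

FILTER = "span_kind == 'RETRIEVER' and 'programming' in input.value"

# Raises SyntaxError if the filter is not a single Python expression.
parsed = ast.parse(FILTER, mode="eval")


def programming_retriever_query():
    """Build a query for retriever spans whose input mentions 'programming'."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return SpanQuery().where(FILTER)
```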
Filtering Spans by Evaluation Results
Filtering spans by evaluation results, e.g. score or label, can be done via a special syntax. The name of the evaluation is specified as an indexer on the special keyword evals. The example below filters for spans with the incorrect label on their correctness evaluations. (See here for how to compute evaluations for traces, and here for how to ingest those results back to Phoenix.)
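A sketch of such a filter, assuming an evaluation named correctness has been ingested for the spans. The function name is illustrative.

```python
# The evaluation name is the indexer on the special keyword evals.
INCORRECT = "evals['correctness'].label == 'incorrect'"


def incorrect_spans_query():
    """Build a query for spans labeled 'incorrect' by the correctness eval."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return SpanQuery().where(INCORRECT)
```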
Filtering on Metadata
metadata is a dictionary-valued attribute, so it can be filtered with dictionary-style key lookups.
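For illustration, a filter on a metadata key could be written as below. The key topic and its value are hypothetical and depend on what your application attaches to its spans.

```python
# "topic" is a hypothetical metadata key used only for illustration.
BY_TOPIC = "metadata['topic'] == 'programming'"


def spans_by_topic_query():
    """Build a query for spans whose metadata topic equals 'programming'."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return SpanQuery().where(BY_TOPIC)
```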
Filtering for Substring
Note that Python strings do not have a contain method, and substring search is done with the in operator.
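The in operator works the same way inside a filter string as it does in ordinary Python. In this sketch the metadata key topic is hypothetical.

```python
# Plain Python: substring search uses the in operator.
has_match = "programming" in "functional programming"

# The same operator inside a filter string ("topic" is a hypothetical key).
TOPIC_CONTAINS = "'programming' in metadata['topic']"


def topic_substring_query():
    """Build a query for spans whose metadata topic contains 'programming'."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return SpanQuery().where(TOPIC_CONTAINS)
```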
Filtering for No Evaluations
Get spans that do not have an evaluation attached yet.
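One way to express this, sketched on the assumption that a missing evaluation surfaces as a None label. The evaluation name correctness and the function name are illustrative.

```python
# Assumption: spans without a correctness evaluation have a None label.
NOT_EVALUATED = "evals['correctness'].label is None"


def unevaluated_spans_query():
    """Build a query for spans with no correctness evaluation yet."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return SpanQuery().where(NOT_EVALUATED)
```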
How to Extract Attributes
Span attributes can be selected by listing them inside the .select() method.
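A short sketch selecting two attributes by name, assuming SpanQuery from phoenix.trace.dsl. The function name is illustrative.

```python
def input_output_query():
    """Build a query returning the input.value and output.value attributes."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    # Positional arguments keep the attribute names as column names.
    return SpanQuery().select("input.value", "output.value")
```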
Renaming Output Columns
Keyword-argument style can be used to rename the columns in the dataframe. The example below returns two columns named input and output instead of the original names of the attributes.
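A sketch of the renaming, where each keyword becomes the output column name. The function name is illustrative.

```python
def renamed_columns_query():
    """Build a query whose columns are named input and output."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    # keyword = output column name, value = span attribute
    return SpanQuery().select(input="input.value", output="output.value")
```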
Arbitrary Output Column Names
If arbitrary output names are desired, e.g. names with spaces and symbols, we can leverage Python's double-asterisk idiom for unpacking a dictionary, as shown below.
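A sketch of the unpacking idiom. The column names below are made up to show names that could not be written as ordinary keyword arguments.

```python
# Made-up column names containing spaces and symbols.
COLUMNS = {
    "Value (Input)": "input.value",
    "Value (Output)": "output.value",
}


def arbitrary_names_query():
    """Build a query whose output columns use the names in COLUMNS."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    # ** unpacks the dictionary into keyword arguments.
    return SpanQuery().select(**COLUMNS)
```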
Advanced Usage
Concatenating
The document contents can also be concatenated together. The query below concatenates the list of document.content with \n\n (double newlines), which is the default separator. Keyword arguments are necessary to name the output columns, and in this example we name the output column reference. (Python's double-asterisk unpacking idiom can be used to specify arbitrary output names containing spaces or symbols. See Arbitrary Output Column Names for an example.)
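The sketch below first shows in plain Python what joining with the default separator produces, then the corresponding query. The document strings are made up and the function name is illustrative.

```python
# Plain-Python picture of the result for one span (made-up contents).
documents = ["first document", "second document"]
reference = "\n\n".join(documents)


def concatenated_documents_query():
    """Build a query joining each span's document contents into one string."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return SpanQuery().concat("retrieval.documents", reference="document.content")
```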
Special Separators
If a different separator is desired, say \n************, it can be specified as follows.
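A sketch of overriding the separator, assuming your Phoenix version provides the with_concat_separator method on SpanQuery. The function name is illustrative.

```python
SEPARATOR = "\n************"


def custom_separator_query():
    """Build a concatenation query that uses SEPARATOR between documents."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return (
        SpanQuery()
        .concat("retrieval.documents", reference="document.content")
        # Assumed method name: check it against your installed version.
        .with_concat_separator(separator=SEPARATOR)
    )
```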
Using Parent ID as Index
This is useful for joining a span to its parent span. To do that we would first index the child span by selecting its parent ID and renaming it as span_id. This works because span_id is a special column name: whichever column has that name becomes the index of the output DataFrame.
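A sketch of indexing child spans by their parent ID. The choice of output.value as the second column is only an example, and the function name is illustrative.

```python
def child_spans_by_parent_query():
    """Build a query whose result is indexed by each span's parent ID."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return SpanQuery().select(
        span_id="parent_id",    # renamed to span_id, so it becomes the index
        output="output.value",
    )
```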
Joining a Span to Its Parent
To do this, we provide two queries to Phoenix, which returns two dataframes that can be joined together with pandas. The query_for_child_spans query uses parent_id as index as shown in Using Parent ID as Index, and px.Client().query_spans() returns a list of dataframes when multiple queries are given.
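A sketch of the two-query join, assuming pandas and arize-phoenix are installed and that root spans are those with no parent. The wrapper function name is illustrative.

```python
def join_children_to_parents():
    """Return root spans joined with the documents retrieved by a child span."""
    import pandas as pd
    import phoenix as px  # requires the arize-phoenix package
    from phoenix.trace.dsl import SpanQuery

    query_for_parent_spans = (
        SpanQuery()
        .where("parent_id is None")  # root spans have no parent
        .select(input="input.value", output="output.value")
    )
    query_for_child_spans = (
        SpanQuery()
        .where("span_kind == 'RETRIEVER'")
        .select(span_id="parent_id")  # index children by their parent's ID
        .concat("retrieval.documents", reference="document.content")
    )
    parent_df, child_df = px.Client().query_spans(
        query_for_parent_spans, query_for_child_spans
    )
    # An inner join on the shared index keeps only parents with a retriever child.
    return pd.concat([parent_df, child_df], axis=1, join="inner")
```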
How to Use Data for Evaluation
Extract the Input and Output from LLM Spans
To learn more about extracting span attributes, see How to Extract Attributes.
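A sketch of the extraction this section's title describes, combining a filter on LLM spans with renamed input and output columns. The function name is illustrative.

```python
def llm_input_output_query():
    """Build a query returning the input and output of every LLM span."""
    from phoenix.trace.dsl import SpanQuery  # requires the arize-phoenix package

    return (
        SpanQuery()
        .where("span_kind == 'LLM'")
        .select(input="input.value", output="output.value")
    )
```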
Retrieval (RAG) Relevance Evaluations
To extract the dataframe input for Retrieval (RAG) Relevance evaluations, we can apply the query described in Querying for Retrieved Documents, or leverage the helper function implementing the same query.
Q&A on Retrieved Data Evaluations
To extract the dataframe input to the Q&A on Retrieved Data evaluations, we can use a helper function or use the following query (which is what's inside the helper function). This query applies techniques described in the Advanced Usage section.
Pre-defined Queries
Phoenix also provides helper functions that execute predefined queries for the following use cases.
If you need to run the query against a specific project, you can add project_name as a parameter to any of the pre-defined queries.
Retrieved Documents
The query shown in the example can be done more simply with a helper function as follows. The output DataFrame can be used directly as input for the Retrieval (RAG) Relevance evaluations.
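A sketch of calling the helper. The import path below is an assumption that may differ between Phoenix versions, and the wrapper function name is illustrative.

```python
def fetch_retrieved_documents(project_name=None):
    """Return the retrieved-documents DataFrame via the predefined helper."""
    import phoenix as px  # requires the arize-phoenix package
    # Assumed import path: verify it against your installed version.
    from phoenix.session.evaluation import get_retrieved_documents

    return get_retrieved_documents(px.Client(), project_name=project_name)
```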
Q&A on Retrieved Data
To extract the dataframe input to the Q&A on Retrieved Data evaluations, we can use the following helper function.
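A sketch of calling this helper. As with the other helper, the import path is an assumption that may differ between Phoenix versions, and the wrapper function name is illustrative.

```python
def fetch_qa_with_reference(project_name=None):
    """Return the question, answer and reference DataFrame via the helper."""
    import phoenix as px  # requires the arize-phoenix package
    # Assumed import path: verify it against your installed version.
    from phoenix.session.evaluation import get_qa_with_reference

    return get_qa_with_reference(px.Client(), project_name=project_name)
```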
The output DataFrame would look something like the one below. The input column contains the question, the output column contains the answer, and the reference column contains a concatenation of all the retrieved documents. This helper function assumes that the questions and answers are the input.value and output.value attributes of the root spans, and that the list of retrieved documents is contained in a direct child span of the root span. (The helper function applies the techniques described in the Advanced Usage section.)
span_id | input | output | reference
CDBC4CE34 | What was the author's trick for ... | The author's trick for ... | Even then it took me several years to understand ...
... | ... | ... | ...