Quickstart: Tracing

Inspect the inner workings of your LLM application using OpenInference Traces

Overview

Tracing is a powerful tool for understanding the behavior of your LLM application. Phoenix has best-in-class tracing, regardless of what framework you use, with first-class instrumentation for a variety of frameworks (LlamaIndex, LangChain, DSPy), SDKs (OpenAI, Bedrock, Mistral, Vertex), and languages (Python, JavaScript). You can also manually instrument your application using the OpenTelemetry SDK.
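
If you take the manual route, the sketch below shows one way to point the OpenTelemetry Python SDK at a locally running Phoenix collector. It is a minimal sketch, assuming the opentelemetry-sdk and opentelemetry-exporter-otlp-proto-http packages are installed and that Phoenix is listening on its default port (6006); the span name and attribute are placeholders for illustration.

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Send spans to the local Phoenix collector (default port 6006)
exporter = OTLPSpanExporter(endpoint="http://localhost:6006/v1/traces")
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

# "llm-call" is a placeholder span name for illustration
with tracer.start_as_current_span("llm-call") as span:
    span.set_attribute("llm.prompt", "What is the meaning of life?")
    # ...call your LLM here...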

To get started with traces, you will first want to start a local Phoenix app. Below we will explore how to use Phoenix in a notebook, but you can deploy Phoenix once you are ready for a persistent observability platform.

In your Jupyter or Colab environment, run the following command to install.

pip install arize-phoenix

To get started, launch the Phoenix app.

import phoenix as px
session = px.launch_app()

The above launches a Phoenix server that acts as a trace collector for any LLM application running locally in your Jupyter notebook!

🌍 To view the Phoenix app in your browser, visit https://z8rwookkcle1-496ff2e9c6d22116-6060-colab.googleusercontent.com/
📺 To view the Phoenix app in a notebook, run `px.active_session().view()`
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix

The launch_app command will print a URL for you to view the Phoenix UI. You can access this URL again at any time via the session object. Now that Phoenix is up and running, you can set up tracing for your AI application so that you can debug it as the traces stream in.
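
For example, you can retrieve the URL from the session object or re-open the UI inline at any time:

# Print the URL of the running Phoenix UI
print(session.url)

# Or re-open the Phoenix UI inside the notebook
px.active_session().view()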

To use LlamaIndex's one-click instrumentation, you must first install the integration (plus gcsfs, which this example uses to load a pre-built index):

pip install 'llama-index>=0.10.44' openinference-instrumentation-llama-index gcsfs
import os

import phoenix as px
from gcsfs import GCSFileSystem
from llama_index.core import (
    Settings,
    StorageContext,
    load_index_from_storage,
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor

# To view traces in Phoenix, you will first have to start a Phoenix server. You can do this by running the following:
session = px.launch_app()

# Initialize LlamaIndex auto-instrumentation
LlamaIndexInstrumentor().instrument()

os.environ["OPENAI_API_KEY"] = "<ENTER_YOUR_OPENAI_API_KEY_HERE>"

# LlamaIndex application initialization may vary
# depending on your application
Settings.llm = OpenAI(model="gpt-4-turbo-preview")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")


# Load a pre-built index of the Arize documentation from cloud storage
file_system = GCSFileSystem(project="public-assets-275721")
index_path = "arize-phoenix-assets/datasets/unstructured/llm/llama-index/arize-docs/index/"
storage_context = StorageContext.from_defaults(
    fs=file_system,
    persist_dir=index_path,
)

index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()

# Query your LlamaIndex application
query_engine.query("What is the meaning of life?")
query_engine.query("How can I deploy Arize?")

# View the traces in the Phoenix UI
px.active_session().url

See the LlamaIndex integration guide for the full details as well as support for older versions of LlamaIndex.

Once you've executed a sufficient number of queries (or chats) in your application, you can view the traces in the Phoenix UI by refreshing the browser page.

Exporting Traces from Phoenix

# You can export a dataframe from the session
df = px.Client().get_spans_dataframe()

# Note that you can apply a filter if you would like to export only a subset of spans
df = px.Client().get_spans_dataframe('span_kind == "RETRIEVER"')
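
The exported spans are a regular pandas DataFrame, so you can inspect and slice them as usual. A quick sketch (column names other than span_kind depend on your instrumentation):

# Peek at the exported spans
print(df.head())

# Count spans by kind (e.g., LLM, RETRIEVER, EMBEDDING)
print(df["span_kind"].value_counts())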

For full details on how to export trace data, see the detailed guide.

Evaluating Traces

In addition to launching Phoenix with LlamaIndex and LangChain, teams can export trace data to a dataframe in order to run LLM Evals on the data.
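
As a rough sketch of what that can look like, the snippet below grades the relevance of retrieved documents with an LLM judge. It assumes the arize-phoenix-evals package is installed and an OpenAI key is set; exact names in phoenix.evals may vary by version, and the template expects specific input columns that you may need to derive from the span attributes.

from phoenix.evals import (
    OpenAIModel,
    RAG_RELEVANCY_PROMPT_RAILS_MAP,
    RAG_RELEVANCY_PROMPT_TEMPLATE,
    llm_classify,
)

# Export retriever spans and grade document relevance with an LLM judge.
# Note: the template expects "input" and "reference" columns, which you
# may need to rename or derive from the exported span attributes.
df = px.Client().get_spans_dataframe('span_kind == "RETRIEVER"')
evals = llm_classify(
    dataframe=df,
    model=OpenAIModel(model="gpt-4"),
    template=RAG_RELEVANCY_PROMPT_TEMPLATE,
    rails=list(RAG_RELEVANCY_PROMPT_RAILS_MAP.values()),
)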

Learn more in the evals quickstart.

Conclusion

LLM Traces are a powerful way to troubleshoot and understand your application, and they can be leveraged to evaluate its quality. For a full list of notebooks that illustrate this in full color, please check out the notebooks section.
