Inspect the inner workings of your LLM Application using OpenInference Traces
Overview
Tracing is a powerful tool for understanding the behavior of your LLM application. Phoenix has best-in-class tracing, regardless of what framework you use.
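To make the idea of a trace concrete, the sketch below models one as a tree of timed spans and prints it as an indented outline. The `Span` class, its field names, and the sample values are hypothetical simplifications for illustration only; they are not the OpenInference schema or anything Phoenix exposes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """Hypothetical, simplified span (not the OpenInference schema)."""

    span_id: str
    name: str
    span_kind: str  # e.g. "CHAIN", "RETRIEVER", "LLM"
    start_ms: int
    end_ms: int
    parent_id: Optional[str] = None

    @property
    def latency_ms(self) -> int:
        return self.end_ms - self.start_ms


def render(spans: List[Span], parent_id: Optional[str] = None, depth: int = 0) -> List[str]:
    """Return one indented line per span, with children listed under their parent."""
    lines = []
    children = [s for s in spans if s.parent_id == parent_id]
    for span in sorted(children, key=lambda s: s.start_ms):
        indent = "  " * depth
        lines.append(f"{indent}{span.span_kind} {span.name} ({span.latency_ms} ms)")
        lines.extend(render(spans, span.span_id, depth + 1))
    return lines


# Made-up trace: one query that performs a retrieval and then calls an LLM.
trace = [
    Span("1", "query", "CHAIN", 0, 1200),
    Span("2", "retrieve", "RETRIEVER", 10, 210, parent_id="1"),
    Span("3", "synthesize", "LLM", 220, 1190, parent_id="1"),
]
print("\n".join(render(trace)))
```

Reading a trace this way shows where time is spent inside a single request, which is the kind of question a tracing UI is meant to answer.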
To get started with traces, you will first want to start a local Phoenix app.
In your Jupyter or Colab environment, run one of the following commands to install Phoenix.
pip install arize-phoenix[evals]
conda install -c conda-forge arize-phoenix[evals]
To get started, launch the Phoenix app.
import phoenix as px
session = px.launch_app()
The above launches a Phoenix server that acts as a trace collector for any LLM application running locally.
🌍 To view the Phoenix app in your browser, visit https://z8rwookkcle1-496ff2e9c6d22116-6060-colab.googleusercontent.com/
📺 To view the Phoenix app in a notebook, run `px.active_session().view()`
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix
The launch_app command will print a URL for you to view the Phoenix UI. You can access this URL again at any time via the session.
Now that Phoenix is up and running, you can run a LlamaIndex or LangChain application, or call the OpenAI API directly, and debug your application as the traces stream in.
To use LlamaIndex's one-click integration, you must first install the small integration package. The example below then configures Phoenix as the global handler:
import os

import phoenix as px
from llama_index.core import (
    Settings,
    VectorStoreIndex,
    SimpleDirectoryReader,
    set_global_handler,
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

# To view traces in Phoenix, you will first have to start a Phoenix server.
# You can do this by running the following:
session = px.launch_app()

# Once you have started a Phoenix server, you can start your LlamaIndex
# application and configure it to send traces to Phoenix. To do this, you
# will have to configure Phoenix as the global handler.
set_global_handler("arize_phoenix")

# LlamaIndex application initialization may vary
# depending on your application
Settings.llm = OpenAI(model="gpt-4-turbo-preview")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

# Load your data and create an index. Note you usually want to store your
# index in a persistent store like a database or the file system.
documents = SimpleDirectoryReader("YOUR_DATA_DIRECTORY").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Query your LlamaIndex application
query_engine.query("What is the meaning of life?")
query_engine.query("Why did the cow jump over the moon?")

# View the traces in the Phoenix UI
px.active_session().url
See the LlamaIndex integration documentation for the full details, as well as support for older versions of LlamaIndex.
from phoenix.trace.langchain import LangChainInstrumentor

LangChainInstrumentor().instrument()

# Initialize your LangChain application
# This might vary on your use-case. An example Chain is shown below
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import KNNRetriever

embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
# Note: `vectors` and `texts` are not defined in this snippet; they are
# assumed to hold your precomputed embeddings and the matching source texts.
knn_retriever = KNNRetriever(
    index=vectors,
    texts=texts,
    embeddings=OpenAIEmbeddings(),
)
llm = ChatOpenAI(model_name="gpt-3.5-turbo")
chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="map_reduce",
    retriever=knn_retriever,
)

# Execute the chain
response = chain.run("What is OpenInference tracing?")
import os

from openai import OpenAI
from phoenix.trace.openai import OpenAIInstrumentor

# Initialize OpenAI auto-instrumentation
OpenAIInstrumentor().instrument()

# Initialize an OpenAI client
# note you must have the OPENAI_API_KEY environment variable set
client = OpenAI()

# Define a conversation with a user message
conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, can you help me with something?"},
]

# Generate a response from the assistant
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=conversation,
)

# Extract and print the assistant's reply
# The traces will be available in the Phoenix App for the above messages
assistant_reply = response.choices[0].message.content
print(assistant_reply)
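The OpenAI example above notes that OPENAI_API_KEY must be set before the client is created. A small preflight check, sketched below, can surface a missing key before any request is made. The helper name `missing_env_vars` is hypothetical and is not part of Phoenix or the OpenAI client.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


# Check the real environment and report anything that still needs to be set.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Set these environment variables before calling OpenAI: " + ", ".join(missing))
```

Passing a plain dictionary as `environ` makes the check easy to exercise without touching the real environment.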
import phoenix as px
from phoenix.trace.openai import OpenAIInstrumentor

px.launch_app()
OpenAIInstrumentor().instrument()
Once you've executed a sufficient number of queries (or chats) against your application, you can view the traces in the UI by refreshing the browser page.
Trace Datasets
Phoenix also supports datasets that contain OpenInference trace data. This allows data from a running LangChain or LlamaIndex instance to be explored offline for analysis.
There are two ways to extract trace dataframes. The two ways for LangChain are described below.
# You can export a dataframe from the session
# Note that you can apply a filter if you would like to export only a sub-set of spans
df = px.Client().get_spans_dataframe('span_kind == "RETRIEVER"')

# Re-launch the app using the data
px.launch_app(trace=px.TraceDataset(df))
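As a rough picture of what a span-kind filter like the one above keeps, the sketch below counts spans by kind and then selects only the retriever spans using plain Python. The rows and the `filter_by_kind` helper are made up for illustration; this is not how Phoenix evaluates its filter expressions.

```python
from collections import Counter

# Made-up span rows; a real export has many more columns.
spans = [
    {"name": "query", "span_kind": "CHAIN"},
    {"name": "retrieve", "span_kind": "RETRIEVER"},
    {"name": "synthesize", "span_kind": "LLM"},
    {"name": "retrieve", "span_kind": "RETRIEVER"},
]


def filter_by_kind(rows, kind):
    """Keep only the rows whose span_kind equals the requested kind."""
    return [row for row in rows if row["span_kind"] == kind]


# Count spans per kind to see which subsets are worth exporting.
counts = Counter(row["span_kind"] for row in spans)
retriever_spans = filter_by_kind(spans, "RETRIEVER")

print(dict(counts))
print(len(retriever_spans))
```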
LLM traces are a powerful way to troubleshoot and understand your application, and they can be leveraged to evaluate its quality. For a full list of notebooks that illustrate this in full color, please check out the notebooks section.