Quickstart: Deployment

How to use Phoenix outside of the notebook environment.

Phoenix's notebook-first approach to observability makes it a great tool to use during experimentation and pre-production. At some point, however, you will want to ship your application and continue to monitor it as it runs.

To run Phoenix tracing in production, follow these steps:

Setup a Server: Set up your LLM application to run on a server
Instrument: Add OpenInference instrumentation to your server
Observe: Run the Phoenix server as a sidecar or a standalone instance and point your tracing instrumentation to the Phoenix server

Looking for a working application? Jump to our Python and JavaScript examples.

Setup a Server

Setting up a server to run your LLM application can be tricky to bootstrap. While bootstrapping an LLM application is not part of Phoenix, you can take a look at some examples from our partners.

create-llama: A bootstrapping tool for setting up a full-stack LlamaIndex app
langchain-templates: Create a LangChain server using a template

Note that the above scripts and templates are provided purely as examples.

Instrument

In order to make your LLM application observable, it must be instrumented. That is, the code must emit traces. The instrumented data must then be sent to an Observability backend, in our case the Phoenix server.

Phoenix collects traces from your running application using OTLP (OpenTelemetry Protocol). Notably, Phoenix accepts traces produced via instrumentation provided by OpenInference. OpenInference instrumentations automatically instrument your code so that LLM Traces can be exported and collected by Phoenix. To learn more about instrumentation, check out the full details here.

OpenInference currently supports instrumenting your application in both Python and JavaScript. For each of these languages, you will first need to install the OpenTelemetry and OpenInference packages necessary to trace your application.

Install OpenTelemetry

For a comprehensive guide to Python instrumentation, please consult OpenTelemetry's guide.

Install OpenTelemetry packages

pip install opentelemetry-api opentelemetry-instrumentation opentelemetry-semantic-conventions opentelemetry-exporter-otlp-proto-http

For a comprehensive guide on instrumenting NodeJS using OpenTelemetry, consult their guide.

npm install @opentelemetry/exporter-trace-otlp-proto @opentelemetry/resources @opentelemetry/sdk-trace-node --save

Install OpenInference Instrumentations

To have your code produce LLM spans using OpenInference, you must pick the appropriate instrumentation packages and install them using a package manager. For a comprehensive list of instrumentations, check out the OpenInference repository.

Initialize Instrumentation

In order for your application to export traces, it must be instrumented using OpenInference instrumentors. Note that instrumentation strategies differ by language, so please consult OpenTelemetry's guidelines for full details.

Note that the Python example below assumes you are running Phoenix via Docker Compose and thus uses the URL http://phoenix:6006, while the JavaScript example uses http://localhost:6006. If you are deploying Phoenix separately, replace these URLs with the full URL of your running Phoenix instance.

Below is an example of what instrumentation might look like for LlamaIndex. The instrument function should be called before main is run in your server.

from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from opentelemetry import trace as trace_api
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor


def instrument():
    # Create a tracer provider and export each finished span to Phoenix over OTLP/HTTP
    tracer_provider = trace_sdk.TracerProvider()
    span_exporter = OTLPSpanExporter(endpoint="http://phoenix:6006/v1/traces")
    span_processor = SimpleSpanProcessor(span_exporter)
    tracer_provider.add_span_processor(span_processor)
    # Register the provider globally, then patch LlamaIndex so it emits spans
    trace_api.set_tracer_provider(tracer_provider)
    LlamaIndexInstrumentor().instrument()
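
The example above hard-codes the collector URL. If Phoenix is deployed separately, the URL has to be swapped for that of your own instance. The following is a minimal sketch of one way to derive the traces URL from a base Phoenix URL by appending the /v1/traces path used above. The helper name build_traces_endpoint is hypothetical and is not part of Phoenix or OpenTelemetry.

```python
from urllib.parse import urlsplit

TRACES_PATH = "/v1/traces"  # path used by the instrumentation example above


def build_traces_endpoint(base_url: str) -> str:
    """Return the OTLP/HTTP traces URL for a Phoenix base URL (hypothetical helper)."""
    parts = urlsplit(base_url)
    # Reject values such as "phoenix:6006" that are missing the http(s) scheme
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an http(s) URL, got {base_url!r}")
    trimmed = base_url.rstrip("/")
    # Leave URLs that already point at the traces path untouched
    if trimmed.endswith(TRACES_PATH):
        return trimmed
    return trimmed + TRACES_PATH
```

The result can then be passed to OTLPSpanExporter in place of the hard-coded string, for example OTLPSpanExporter(endpoint=build_traces_endpoint("http://phoenix:6006")).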

The code below is written in ESM format.

For instrumentation to work with NodeJS, you must create a file named instrumentation.js and have it run BEFORE all other server code in index.js.

Place the following code in an instrumentation.js file:

import { registerInstrumentations } from "@opentelemetry/instrumentation";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
import {
  ConsoleSpanExporter,
  SimpleSpanProcessor,
} from "@opentelemetry/sdk-trace-base";
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";
import { Resource } from "@opentelemetry/resources";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-proto";
import { SemanticResourceAttributes } from "@opentelemetry/semantic-conventions";
import { diag, DiagConsoleLogger, DiagLogLevel } from "@opentelemetry/api";

// For troubleshooting, set the log level to DiagLogLevel.DEBUG
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG);

const provider = new NodeTracerProvider({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: "openai-service",
  }),
});

provider.addSpanProcessor(new SimpleSpanProcessor(new ConsoleSpanExporter()));
provider.addSpanProcessor(
  new SimpleSpanProcessor(
    new OTLPTraceExporter({
      url: "http://localhost:6006/v1/traces",
    }),
  ),
);
provider.register();

registerInstrumentations({
  instrumentations: [new OpenAIInstrumentation({})],
});

console.log("👀 OpenInference initialized");

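Then make sure that this file is loaded before the rest of the server code. Because the file above is an ES module, preload it with the --import flag, which is available in recent Node.js releases (the -r flag only preloads CommonJS modules).

node --import ./instrumentation.js index.js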

Observe

Lastly, we must run the Phoenix server so that our application can export spans to it. To do this, we recommend running Phoenix via an image. Phoenix images are available on Docker Hub.

Docker

In order to run the Phoenix server, you will have to start the application. Below are a few examples of how you can run the application on your local machine.

Pull the image you would like to run

docker pull arizephoenix/phoenix

Pick an image you would like to run or simply run the latest:

Note that for production you should pin the image to the version of Phoenix you plan on using, e.g. arizephoenix/phoenix:4.0.0

docker run -p 6006:6006 -p 4317:4317 -i -t arizephoenix/phoenix:latest

See Ports for details on the ports for the container.
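
After starting the container, it can be useful to confirm that the published ports accept connections before pointing your application at them. The following is a rough sketch using only the Python standard library. It assumes the container was started with the port mappings shown above (6006 and 4317); the function names are hypothetical. A successful TCP connection only shows that something is listening on the port, not that Phoenix is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts
        return False


def check_phoenix_ports(host: str = "localhost", ports=(6006, 4317)) -> dict:
    """Map each port to whether it currently accepts TCP connections."""
    return {port: is_port_open(host, port) for port in ports}


if __name__ == "__main__":
    for port, is_open in check_phoenix_ports().items():
        print(f"port {port}: {'open' if is_open else 'closed'}")
```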

For v5.2.0 or later:

phoenix serve

For pre-v5.2.0:

python -m phoenix.server.main serve

Note that the above simply starts the Phoenix server locally. A simple way to make sure your application always has a running Phoenix server as a collector is to run the Phoenix server as a sidecar.

Here is an example compose.yaml:

services:
  phoenix:
    image: arizephoenix/phoenix:latest
    ports:
      - "6006:6006"  # UI and OTLP HTTP collector
      - "4317:4317"  # OTLP gRPC collector
  backend:
    build:
      context: ./backend
      dockerfile: Dockerfile
      args:
        OPENAI_API_KEY: ${OPENAI_API_KEY}
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - COLLECTOR_ENDPOINT=http://phoenix:6006/v1/traces
      - PROD_CORS_ORIGIN=http://localhost:3000
      # Set INSTRUMENT_LLAMA_INDEX=false to disable instrumentation
      - INSTRUMENT_LLAMA_INDEX=true
    healthcheck:
      test: ["CMD", "wget", "--spider", "http://0.0.0.0:8000/api/chat/healthcheck"]
      interval: 5s
      timeout: 1s
      retries: 5
  frontend:
    build: frontend
    ports:
      - "3000:3000"
    depends_on:
      backend:
        condition: service_healthy

This way you will always have a running Phoenix instance when you run:

docker compose up
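
The compose file above passes COLLECTOR_ENDPOINT and INSTRUMENT_LLAMA_INDEX to the backend container, but how the backend consumes them is not shown here. The following is a minimal sketch of one way the backend might read them. The TracingSettings class, the load_tracing_settings function, the fallback URL, and the set of values treated as "disabled" are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed fallback, matching the URL used in the compose file above
DEFAULT_ENDPOINT = "http://phoenix:6006/v1/traces"


@dataclass(frozen=True)
class TracingSettings:
    collector_endpoint: str
    instrument_llama_index: bool


def load_tracing_settings(env: Optional[Mapping[str, str]] = None) -> TracingSettings:
    """Read tracing settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: instrumentation stays enabled unless the flag is explicitly falsy
    flag = env.get("INSTRUMENT_LLAMA_INDEX", "true").strip().lower()
    return TracingSettings(
        collector_endpoint=env.get("COLLECTOR_ENDPOINT", DEFAULT_ENDPOINT),
        instrument_llama_index=flag not in ("false", "0", "no", "off"),
    )
```

An instrument function could call load_tracing_settings() first, return early when instrument_llama_index is false, and otherwise pass collector_endpoint to the span exporter.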

For full details on how to configure Phoenix, check out the Configuration section.
