Log Evaluations to Arize

Use the log_evaluations function from our Python SDK to attach evaluations you've run to your traces. The code below assumes you have already completed an evaluation run and have an evals_dataframe object, as well as a traces_dataframe object that provides the span_id each evaluation should be attached to. If you don't have these objects yet, see the sketch directly below.
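As a minimal sketch (not a definitive recipe), here is one way to produce the two dataframes with Phoenix. The hallucination template and rails come from phoenix.evals, the model name is an arbitrary example, and llm_classify expects the dataframe's columns to match the template's variables, so you may need to rename columns from the spans dataframe first.

import phoenix as px
from phoenix.evals import (
    OpenAIModel,
    llm_classify,
    HALLUCINATION_PROMPT_TEMPLATE,
    HALLUCINATION_PROMPT_RAILS_MAP,
)

# Pull the spans your app has sent to Phoenix; this includes context.span_id
traces_dataframe = px.Client().get_spans_dataframe()

# Run an LLM-judged hallucination eval over the traces
# (model name is an arbitrary example; you may need to rename columns so the
# dataframe matches the template's variables)
raw_evals = llm_classify(
    dataframe=traces_dataframe,
    model=OpenAIModel(model="gpt-4o-mini"),
    template=HALLUCINATION_PROMPT_TEMPLATE,
    rails=list(HALLUCINATION_PROMPT_RAILS_MAP.values()),
    provide_explanation=True,  # adds an explanation column
)

# Derive a numeric score from the label (rails are "factual"/"hallucinated")
raw_evals["score"] = (raw_evals["label"] == "factual").astype(int)

# Prefix the columns to match the eval.<eval_name>.* convention
evals_dataframe = raw_evals.add_prefix("eval.Hallucination.")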

The evals_dataframe requires four columns, which should be auto-generated for you based on the evaluation you ran using Phoenix (an illustrative example follows the list):

  • eval.<eval_name>.label
  • eval.<eval_name>.score
  • eval.<eval_name>.explanation
  • context.span_id
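
For illustration, here is what a well-formed evals_dataframe might look like for a hypothetical eval named hallucination (the span IDs, labels, and explanations below are made up):

import pandas as pd

# Hypothetical evals_dataframe with the four required columns
evals_dataframe = pd.DataFrame(
    {
        "eval.hallucination.label": ["factual", "hallucinated"],
        "eval.hallucination.score": [1, 0],
        "eval.hallucination.explanation": [
            "The answer is supported by the retrieved documents.",
            "The answer makes claims not found in the retrieved documents.",
        ],
        "context.span_id": ["0a1b2c3d4e5f6071", "1a2b3c4d5e6f7081"],
    }
)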

Currently, evaluations are logged to Arize in batches every 10 minutes, and we're working on making ingestion as close to instant as possible! Reach out to support@arize.com if your evaluations aren't showing up.

import os
from arize.pandas.logger import Client

API_KEY = os.environ.get("ARIZE_API_KEY")
SPACE_ID = os.environ.get("ARIZE_SPACE_ID")

# Initialize the Arize client
arize_client = Client(space_id=SPACE_ID, api_key=API_KEY)

# Use the same model_id that your traces were logged under
model_id = "quickstart-llm-tutorial"

# Index the evals by span ID so each evaluation is attached to its span
# (assumes evals_dataframe rows are in the same order as traces_dataframe)
evals_dataframe = evals_dataframe.set_index(traces_dataframe["context.span_id"])

# Use Arize client to log evaluations
response = arize_client.log_evaluations(
    dataframe=evals_dataframe,
    model_id=model_id,
)

# A successful request returns a status_code of 200
if response.status_code != 200:
    print(f"❌ logging failed with response code {response.status_code}, {response.text}")
else:
    print("✅ You have successfully logged evaluations to Arize")
