
Python Pandas (batch)

Batch Logging - Designed for sending batches of data to Arize
Use the arize Python library to monitor machine learning predictions with a few lines of code, either in a Jupyter Notebook or on a Python server that batch processes backend data.
The most commonly used functions/objects are:
Client: Initialize to begin logging model data to Arize.
Schema: Organize and map the column names that contain model data within your Pandas dataframe.
log: Log the inferences in a dataframe to Arize via a POST request.
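Because log sends the data with a POST request, its return value can be inspected like an HTTP response. The following is a minimal sketch, assuming the returned object exposes status_code and text attributes in the way a requests.Response does. FakeResponse and summarize_log_response are hypothetical names used for illustration only and are not part of the arize library.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the object returned by log (not part of arize)."""
    status_code: int
    text: str = ""


def summarize_log_response(response) -> str:
    """Return a short message saying whether the POST request succeeded."""
    # Assumption: a 200 status code means the batch was accepted.
    if response.status_code == 200:
        return "logged successfully"
    return f"logging failed with status {response.status_code}: {response.text}"


print(summarize_log_response(FakeResponse(200)))
print(summarize_log_response(FakeResponse(400, "bad schema")))
```

In real use, the value returned by the client's log call would be passed in place of FakeResponse.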

Python Pandas Example

For examples and interactive notebooks, see https://docs.arize.com/arize/examples
# install and import dependencies
!pip install -q arize

import datetime

from arize.pandas.logger import Client
from arize.utils.types import ModelTypes, Environments, Schema, Metrics
import numpy as np
import pandas as pd

# create Arize client
SPACE_KEY = "SPACE_KEY"
API_KEY = "API_KEY"
arize_client = Client(space_key=SPACE_KEY, api_key=API_KEY)

# define schema
# feature_column_names is assumed to be a list of the feature columns in df
schema = Schema(
    prediction_id_column_name="prediction_id",
    timestamp_column_name="prediction_ts",
    prediction_label_column_name="predicted_label",
    actual_label_column_name="actual_label",
    feature_column_names=feature_column_names,
)

# log data
# df is assumed to be a Pandas dataframe containing the columns named in the schema
response = arize_client.log(
    dataframe=df,
    schema=schema,
    model_id="binary-classification-metrics-only-batch-ingestion-tutorial",
    model_version="1.0.0",
    model_type=ModelTypes.BINARY_CLASSIFICATION,
    metrics_validation=[Metrics.CLASSIFICATION],
    validate=True,
    environment=Environments.PRODUCTION,
)
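The example above uses df and feature_column_names without defining them. As a minimal sketch, a dataframe with the column names the schema expects could be built as follows. The feature names (age, income), the label values, the row count, and the use of Unix epoch seconds for prediction_ts are all assumptions made for illustration, not values taken from the Arize tutorial.

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical feature names; replace with the feature columns of your own model.
feature_column_names = ["age", "income"]

rng = np.random.default_rng(seed=0)
n_rows = 5
now = datetime.datetime.now(datetime.timezone.utc)

df = pd.DataFrame(
    {
        # One unique ID per prediction
        "prediction_id": [f"pred-{i}" for i in range(n_rows)],
        # Assumption: timestamps are stored as Unix epoch seconds
        "prediction_ts": [
            int((now - datetime.timedelta(minutes=i)).timestamp())
            for i in range(n_rows)
        ],
        # Made-up binary classification labels
        "predicted_label": rng.choice(["fraud", "not_fraud"], size=n_rows),
        "actual_label": rng.choice(["fraud", "not_fraud"], size=n_rows),
        # Made-up feature values
        "age": rng.integers(18, 80, size=n_rows),
        "income": rng.normal(50_000, 10_000, size=n_rows),
    }
)

print(df.dtypes)
```

Every column named in the schema appears in the dataframe, and each feature column holds a single type.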

Install the Package

The Arize SDK requires Python >= 3.7.
pip install arize  # Install the Arize SDK
pip install arize[AutoEmbeddings]  # Install extra dependencies to autogenerate embeddings

Initialize Arize Client

Import Client, Schema, ModelTypes, Environments, and Metrics to begin logging a Pandas dataframe:
from arize.pandas.logger import Client
from arize.utils.types import ModelTypes, Environments, Schema, Metrics

Convert Mixed-Type Columns To Float

Data ingestion rejects datasets with mixed-type columns. Convert these columns to float before sending. Below is an example of a mixed-type column in Pandas and how to convert it.
import pandas as pd

# Example Series with mixed types
mixed = pd.Series([1, "", 2])  # it has numbers and strings
mixed.dtype  # dtype('O')

# It should be converted to float
# Replace "" with NaN, then cast explicitly so the result is float64
mixed = mixed.replace("", float("NaN")).astype(float)
mixed.dtype  # dtype('float64')
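The same conversion can be applied across a whole dataframe before logging. The following is a sketch only: mixed_type_columns and to_float are hypothetical helper names, not part of the arize library, and the sample columns are made up. Note that errors="coerce" turns every non-numeric value into NaN, not just empty strings, so it suits columns that are meant to be numeric.

```python
import pandas as pd


def mixed_type_columns(df: pd.DataFrame) -> list:
    """Return names of object columns that hold more than one Python type."""
    mixed = []
    for name in df.columns:
        col = df[name]
        if col.dtype != object:
            continue  # numeric, datetime, etc. columns hold a single type
        kinds = {type(value) for value in col.dropna()}
        if len(kinds) > 1:
            mixed.append(name)
    return mixed


def to_float(col: pd.Series) -> pd.Series:
    """Convert a column to float64; values that are not numbers become NaN."""
    return pd.to_numeric(col, errors="coerce").astype("float64")


# Made-up sample data: "amount" mixes numbers with an empty string.
df = pd.DataFrame(
    {
        "amount": [1, "", 2.5],
        "city": ["NYC", "SF", "LA"],
        "score": [0.1, 0.2, 0.3],
    }
)

cols = mixed_type_columns(df)
for name in cols:
    df[name] = to_float(df[name])

print(cols)
print(df.dtypes)
```

Only the amount column is reported as mixed. The all-string city column is left alone, since it holds a single type.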

Benchmark Tests

Low-latency data ingestion is important to many customers. The benchmarking Colab below demonstrates how efficiently Arize uploads data from a Python environment.
Sending 10 Million Inferences to Arize in 90 Seconds (Colab link)