Quickstart

Learn how to get started using Arize!

Quickstart Overview

Arize integrates with your ML stack, no matter where your data is hosted

Step 1: Upload Sample Data Via SDK

To get started, we'll prepare a simple classification model dataset from scikit-learn to send via the Python SDK. Install arize, import dependencies, and set your SPACE_KEY and API_KEY.
!pip install arize
from arize.pandas.logger import Client, Schema
from arize.utils.types import ModelTypes, Environments, Metrics
API_KEY = 'YOUR API KEY'
SPACE_KEY = 'YOUR SPACE KEY'
arize_client = Client(space_key=SPACE_KEY, api_key=API_KEY)
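If you would rather not paste keys into a notebook, one option is to read them from environment variables. The sketch below is only an illustration: the helper `read_arize_credentials` and the variable names `ARIZE_SPACE_KEY` and `ARIZE_API_KEY` are hypothetical choices for this example, not names defined by Arize.

```python
import os


def read_arize_credentials(env=os.environ):
    """Return (space_key, api_key) from a mapping, or raise a clear error."""
    # ARIZE_SPACE_KEY and ARIZE_API_KEY are hypothetical names picked for this sketch.
    required = ("ARIZE_SPACE_KEY", "ARIZE_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["ARIZE_SPACE_KEY"], env["ARIZE_API_KEY"]


# A stand-in mapping so the sketch runs without real credentials.
SPACE_KEY, API_KEY = read_arize_credentials(
    {"ARIZE_SPACE_KEY": "YOUR SPACE KEY", "ARIZE_API_KEY": "YOUR API KEY"}
)
```

Calling `read_arize_credentials()` with no argument reads from the real process environment, and the resulting values can be passed to `Client(space_key=SPACE_KEY, api_key=API_KEY)` as shown above.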

Step 2: Download Dataset

For this example, load the breast cancer dataset with scikit-learn's load_breast_cancer, assign it to a variable, and preview the data to better understand what we're working with.
from sklearn.datasets import load_breast_cancer
breast_cancer_dataset = load_breast_cancer()
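One way to preview the data is a small summary of its size and class balance. The helper `preview_dataset` below is a hypothetical sketch, not part of scikit-learn or Arize. It assumes only that the object supports dictionary-style access to `data`, `feature_names`, `target`, and `target_names`, which is how the dataset is accessed later in this quickstart. A tiny stand-in dictionary keeps the example runnable on its own.

```python
from collections import Counter


def preview_dataset(bunch):
    """Summarize row count, feature count, feature names, and class balance."""
    data = bunch["data"]
    names = bunch["target_names"]
    return {
        "rows": len(data),
        "features": len(data[0]),
        "feature_names": [str(name) for name in bunch["feature_names"]],
        # Translate each integer target into its label name before counting.
        "class_counts": dict(Counter(str(names[i]) for i in bunch["target"])),
    }


# Stand-in data with the same layout as the scikit-learn dataset.
sample = {
    "data": [[17.9, 10.4], [20.6, 17.8], [11.4, 20.4]],
    "feature_names": ["mean radius", "mean texture"],
    "target": [0, 0, 1],
    "target_names": ["malignant", "benign"],
}
summary = preview_dataset(sample)
print(summary)
```

Passing `breast_cancer_dataset` instead of `sample` should report 569 rows and 30 features.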

Step 3: Extract Features, Predictions, and Actuals

The dataset contains all the information we need to create a Pandas dataframe. For any dataset, extract the features, predictions, and actuals data. For this example:
breast_cancer_features = breast_cancer_dataset['data'] # feature data
breast_cancer_feature_names = breast_cancer_dataset['feature_names'] # feature names
breast_cancer_targets = breast_cancer_dataset['target'] # actual data
breast_cancer_target_names = breast_cancer_dataset['target_names'] # actual labels
Map each entry in breast_cancer_targets to its corresponding name in breast_cancer_target_names to get a human-readable list of actual labels.
target_name_transcription = [] # this will become our list of actuals
for i in breast_cancer_targets:
    target_name_transcription.append(breast_cancer_target_names[i])
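The same mapping can be written as a list comprehension. The sketch below uses small stand-in values in place of the real dataset arrays so it runs on its own. The variable names `targets` and `target_names` are shortened for illustration.

```python
# Stand-in values; in the quickstart these come from load_breast_cancer().
target_names = ["malignant", "benign"]
targets = [0, 1, 1, 0, 1]

# Equivalent to the for-loop: look up the label name for each integer target.
target_name_transcription = [target_names[i] for i in targets]
print(target_name_transcription)
```

Because the real `breast_cancer_target_names` and `breast_cancer_targets` are NumPy arrays, indexing one with the other, as in `breast_cancer_target_names[breast_cancer_targets].tolist()`, is another way to get the same list.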
Create a Pandas dataframe from our features and actuals (target_name_transcription) to use with the Arize Python Pandas logger.
Note: We've duplicated the actual_label column to create a prediction_label column for simplicity's sake. Data will not populate in the Arize platform without a record of prediction data.
import pandas as pd
df = pd.DataFrame(breast_cancer_features, columns=breast_cancer_feature_names)
df['actual_label'] = target_name_transcription
df['prediction_label'] = target_name_transcription
# optional: reverse the predictions so they no longer match the actuals exactly, which makes this example more interesting in the platform
df['prediction_label'] = df['prediction_label'].iloc[::-1].reset_index(drop=True)
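The `reset_index(drop=True)` call matters here: when a Series is assigned to a dataframe column, pandas aligns it on the index, so a reversed Series that keeps its original index would land back in its original order. The toy example below illustrates this with four stand-in labels. The column names `aligned` and `reversed` are made up for the demonstration.

```python
import pandas as pd

labels = pd.Series(["benign", "malignant", "benign", "benign"])
df = pd.DataFrame({"actual_label": labels})

# Without reset_index, pandas realigns on the original index,
# so the column ends up in the original order.
df["aligned"] = labels.iloc[::-1]

# With reset_index(drop=True), the reversed values get a fresh 0..n-1 index
# and therefore stay reversed after assignment.
df["reversed"] = labels.iloc[::-1].reset_index(drop=True)

# Fraction of rows where the reversed "prediction" still matches the actual.
agreement = (df["actual_label"] == df["reversed"]).mean()
print(df)
print(agreement)
```

The same kind of comparison between `actual_label` and `prediction_label` on the quickstart dataframe gives a rough idea of the accuracy to expect in the platform.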

Step 4: Log Data to Arize

Define the Schema so Arize knows what your columns correspond to.
schema = Schema(
    actual_label_column_name="actual_label",
    prediction_label_column_name="prediction_label",
    feature_column_names=[
        'mean radius', 'mean texture', 'mean perimeter', 'mean area',
        'mean smoothness', 'mean compactness', 'mean concavity',
        'mean concave points', 'mean symmetry', 'mean fractal dimension',
        'radius error', 'texture error', 'perimeter error', 'area error',
        'smoothness error', 'compactness error', 'concavity error',
        'concave points error', 'symmetry error',
        'fractal dimension error', 'worst radius', 'worst texture',
        'worst perimeter', 'worst area', 'worst smoothness',
        'worst compactness', 'worst concavity', 'worst concave points',
        'worst symmetry', 'worst fractal dimension'
    ]
)
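Since the schema refers to dataframe columns by name, a typo in a hand-typed feature list is easy to make. The sketch below shows one way to catch that before logging. The helper `find_missing_columns` is hypothetical and is not part of the Arize SDK. The column lists are short stand-ins with a deliberate typo.

```python
def find_missing_columns(dataframe_columns, schema_columns):
    """Return the schema column names that do not exist in the dataframe."""
    available = set(dataframe_columns)
    return [name for name in schema_columns if name not in available]


# Stand-in column lists; "mean textur" is a deliberate typo.
dataframe_columns = ["mean radius", "mean texture", "actual_label", "prediction_label"]
schema_columns = ["mean radius", "mean textur", "actual_label", "prediction_label"]

missing = find_missing_columns(dataframe_columns, schema_columns)
print(missing)
```

In the quickstart, `df.columns` could serve as the first argument. Building the feature list from the dataset itself, for example with `list(breast_cancer_feature_names)`, is another way to avoid typing the names by hand.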
Log the model data.
response = arize_client.log(
    dataframe=df,
    schema=schema,
    model_id='breast_cancer_dataset',
    model_version='v1',
    model_type=ModelTypes.BINARY_CLASSIFICATION,
    metrics_validation=[Metrics.CLASSIFICATION],
    environment=Environments.PRODUCTION
)
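It can help to check the response before moving on. The sketch below assumes that `response` behaves like an HTTP response object exposing `status_code` and `text`; confirm this against the SDK version you have installed. The helper `describe_log_response` is hypothetical, and `SimpleNamespace` objects stand in for real responses so the example runs without contacting Arize.

```python
from types import SimpleNamespace


def describe_log_response(response):
    """Turn a response-like object into a short, human-readable status line."""
    # Assumption: a 200 status code means the data was accepted.
    if response.status_code == 200:
        return "Successfully logged data to Arize"
    return f"Logging failed with status {response.status_code}: {response.text}"


# Stand-ins for real responses, for illustration only.
ok = SimpleNamespace(status_code=200, text="")
failed = SimpleNamespace(status_code=401, text="unauthorized")

print(describe_log_response(ok))
print(describe_log_response(failed))
```

With the real client, `print(describe_log_response(response))` placed right after the `log` call would surface a failure immediately instead of after the wait in the next step.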

Step 4.5: Get a cup of coffee

It usually takes ~10 minutes for Arize to populate data throughout the platform. We recommend grabbing a quick cup of coffee (or tea) in the meantime!

Step 5: Visualize Model Performance

Now that you've uploaded some data to Arize, check it out on the platform. Navigate to the 'Performance Tracing' tab within your model. Here, you'll see an interactive performance-over-time chart and a performance breakdown visualization.
Performance Breakdown & Performance Insights

Step 6: Set Up One-Click Monitoring

Create monitors to keep an eye on key performance, drift, and data quality metrics. Navigate to the 'Monitors' tab and enable relevant prebuilt monitors for your use case.
Prebuilt monitors in the Monitors Setup tab

Step 7: Relax (With Alerting Notifications On)!

Configure alerts on the 'Config' page within the 'Monitors' tab to stay informed when your model changes unexpectedly.
Use our various alerting integrations or alert via email

Extra Credit: Create A Dashboard

We get it - ML observability is a lot of fun! Keep an eye on key model health metrics with dashboards that give you a single-pane-of-glass view of your model. Create a custom dashboard or use a pre-built template, then copy and paste the dashboard URL to share it with your team!
Example dashboard with key performance metrics

Up Next: Connect to Production Data Pipeline

Connect your Cloud Storage Blob or Data Warehouse to automatically sync model data with Arize!

Looking for more examples? Check out our examples page!