Quickstart: ML

Learn how to get started using Arize!

Step 1: Install Arize

!pip install arize

After installing arize, import the dependencies and create a client with your SPACE_ID and API_KEY.

from arize.pandas.logger import Client
from arize.utils.types import ModelTypes, Environments, Schema, Metrics

API_KEY = 'YOUR API KEY'
SPACE_ID = 'YOUR SPACE ID'
arize_client = Client(space_id=SPACE_ID, api_key=API_KEY)
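
Hard-coding keys is fine for a first run, but a notebook that gets shared should probably read them from the environment. Here is a minimal sketch, assuming the credentials are stored in environment variables named ARIZE_SPACE_ID and ARIZE_API_KEY. Both names, and the helper read_arize_credentials, are chosen for this example and are not defined by the SDK.

```python
import os


def read_arize_credentials(env=os.environ):
    """Return (space_id, api_key) from a mapping of environment variables.

    ARIZE_SPACE_ID and ARIZE_API_KEY are names assumed for this sketch;
    use whatever names your team has agreed on.
    """
    required = ("ARIZE_SPACE_ID", "ARIZE_API_KEY")
    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["ARIZE_SPACE_ID"], env["ARIZE_API_KEY"]
```

The returned pair can then be passed to the client as before, for example SPACE_ID, API_KEY = read_arize_credentials().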

Step 2: Download Dataset

To get started quickly, we'll prepare a simple classification dataset from scikit-learn to send via the Python SDK. For this example, load the load_breast_cancer dataset, assign it to a variable, and preview the data to better understand what we're working with.

from sklearn.datasets import load_breast_cancer
breast_cancer_dataset = load_breast_cancer()
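
The object returned by load_breast_cancer behaves like a dictionary. As a rough way to preview it, the helper below reports the number of rows, the number of features, and how many rows fall under each class name. summarize_dataset is a name made up for this sketch.

```python
from collections import Counter


def summarize_dataset(dataset):
    """Summarize a scikit-learn style dataset that has 'data',
    'feature_names', 'target', and 'target_names' entries."""
    target_names = [str(name) for name in dataset["target_names"]]
    # Translate each integer target into its class name and count them.
    class_counts = Counter(target_names[int(t)] for t in dataset["target"])
    return {
        "rows": len(dataset["data"]),
        "features": len(dataset["feature_names"]),
        "class_counts": dict(class_counts),
    }
```

Calling print(summarize_dataset(breast_cancer_dataset)) on the dataset loaded above should report 569 rows and 30 features, split between the malignant and benign classes.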

Step 3: Extract Features, Predictions, and Actuals

The dataset contains all the information we need to create a Pandas dataframe. For any dataset, extract the features, predictions, and actuals data. For this example:

breast_cancer_features = breast_cancer_dataset['data'] # feature data
breast_cancer_feature_names = breast_cancer_dataset['feature_names'] # feature names
breast_cancer_targets = breast_cancer_dataset['target'] # actual data
breast_cancer_target_names = breast_cancer_dataset['target_names'] # actual labels

Map each value in breast_cancer_targets to its corresponding entry in breast_cancer_target_names to get a human-readable list of actual labels.

# Map each integer target to its class name; this becomes our list of actuals
target_name_transcription = [breast_cancer_target_names[i] for i in breast_cancer_targets]

Create a pandas DataFrame from our predefined features and actuals (target_name_transcription) so it can be sent with the Arize Python pandas logger.

Note: We've duplicated the actual_label column to create a prediction_label column for simplicity's sake. Data will not populate in the Arize platform without a record of prediction data.

import pandas as pd

df = pd.DataFrame(breast_cancer_features, columns=breast_cancer_feature_names)
df['actual_label'] = target_name_transcription
df['prediction_label'] = target_name_transcription

# Optional: reverse the order of the prediction labels so they no longer match
# the actuals exactly, which makes this example more interesting in the platform
df['prediction_label'] = df['prediction_label'].iloc[::-1].reset_index(drop=True)
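
Because the prediction column is just the actuals in reverse order, some rows will agree and others will not. A quick sanity check before logging is to measure the fraction of rows where the two columns match. This is a sketch in plain Python; label_agreement is a hypothetical helper, not part of pandas or the Arize SDK.

```python
def label_agreement(actuals, predictions):
    """Return the fraction of positions where the two label sequences match."""
    actuals = list(actuals)
    predictions = list(predictions)
    if len(actuals) != len(predictions):
        raise ValueError("actuals and predictions must have the same length")
    if not actuals:
        raise ValueError("cannot compute agreement on empty input")
    # Count position-wise matches, then normalize by the number of rows.
    matches = sum(1 for a, p in zip(actuals, predictions) if a == p)
    return matches / len(actuals)
```

With the dataframe above, label_agreement(df['actual_label'], df['prediction_label']) gives a rough idea of the accuracy to expect in the platform. A value of exactly 1.0 would suggest the optional reversal step was skipped.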

Step 4: Log Data to Arize

Define the Schema so Arize knows what your columns correspond to. Log the model data.

schema = Schema(
    actual_label_column_name="actual_label",
    prediction_label_column_name="prediction_label",
    feature_column_names=[
       'mean radius', 'mean texture', 'mean perimeter', 'mean area',
       'mean smoothness', 'mean compactness', 'mean concavity',
       'mean concave points', 'mean symmetry', 'mean fractal dimension',
       'radius error', 'texture error', 'perimeter error', 'area error',
       'smoothness error', 'compactness error', 'concavity error',
       'concave points error', 'symmetry error',
       'fractal dimension error', 'worst radius', 'worst texture',
       'worst perimeter', 'worst area', 'worst smoothness',
       'worst compactness', 'worst concavity', 'worst concave points',
       'worst symmetry', 'worst fractal dimension'
       ]
)
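
Typing out all thirty feature names is error-prone. As an alternative sketch, the list can be taken from the dataset (for example list(breast_cancer_feature_names)) and checked against the dataframe before it is handed to Schema. missing_columns below is a hypothetical helper written for this example.

```python
def missing_columns(available_columns, required_columns):
    """Return the required column names that are absent from available_columns,
    preserving the order in which they were requested."""
    available = {str(name) for name in available_columns}
    return [str(name) for name in required_columns if str(name) not in available]
```

For the dataframe in this guide, the check could look like missing_columns(df.columns, list(breast_cancer_feature_names) + ['actual_label', 'prediction_label']). An empty list means every column named in the schema exists in the dataframe.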

response = arize_client.log(
    dataframe=df,
    schema=schema,
    model_id='breast_cancer_dataset', 
    model_version='v1',
    model_type=ModelTypes.BINARY_CLASSIFICATION,
    metrics_validation=[Metrics.CLASSIFICATION], 
    environment=Environments.PRODUCTION
) 
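
It is worth checking the result of the call before heading to the platform. The sketch below assumes that the object returned by arize_client.log exposes status_code and text attributes in the style of an HTTP response from the requests library. describe_log_response is a helper name invented here.

```python
def describe_log_response(response):
    """Turn a response-like object into a short, human-readable status line.

    Assumes `response` has `status_code` and `text` attributes.
    """
    if response.status_code == 200:
        return "Successfully logged data to Arize"
    return f"Logging failed with status {response.status_code}: {response.text}"
```

Running print(describe_log_response(response)) right after the log call surfaces a failed upload immediately, rather than leaving you to notice missing data in the UI.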

Step 5: Visualize Model Performance

Now that you've uploaded some data to Arize, check it out on the platform. Navigate to the 'Performance Tracing' tab within your model. Here, you'll see an interactive performance-over-time chart and a performance breakdown visualization.

Step 6: Set Up One-Click Monitoring

Create monitors to keep an eye on key performance, drift, and data quality metrics. Navigate to the 'Monitors' tab and enable relevant prebuilt monitors for your use case.

Step 7: Relax (With Alerting Notifications On)!

Configure alerts on the 'Config' page within the 'Monitors' tab to be notified when your model changes unexpectedly.

Extra Credit: Create A Dashboard

We get it: ML observability is a lot of fun! Keep an eye on key model health metrics with dashboards for a single-pane-of-glass view of your model. Create a custom dashboard or use a pre-built template, then copy and paste the dashboard URL to share with your team!

Up Next: Connect to Production Data Pipeline

Connect your Cloud Storage Blob or Data Warehouse to automatically sync model data with Arize!

Looking for more examples? Check out our examples page!

Copyright © 2023 Arize AI, Inc