03.13.2023

What's New

Data Upload: Google BigQuery Integration

Sync Google BigQuery's data warehouse with Arize to automatically monitor and analyze new model data. Learn how to configure BigQuery with Arize here.

Monitors: Setup Monitors Tab

Discover and enable a wide selection of monitors and metrics recommended for your model. Navigate to the Monitors Setup Tab within the monitors listing tab. From there, click the "Enable" button to set up a monitor for your predictions or features.

Model Overview & Dimension Detail: Pre-production Dataset Support

The model overview and dimension details pages now support visualizing training and validation data. This lets users without production data explore and better understand their data.

Ranking: Python SDK Single Record Support

Ingest a sample of your ranking model data record-by-record with Python SDK version 6.1.0. Learn more about Python Single Record initialization and parameters here and follow an example here.

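from typing import List, Optional

# new class (field summary only; import the real class from arize.utils.types)
class RankingPredictionLabel:
    group_id: str  # required
    rank: int  # required
    score: Optional[float]
    label: Optional[str]

# new class (field summary only; import the real class from arize.utils.types)
class RankingActualLabel:
    relevance_labels: Optional[List[str]]
    relevance_score: Optional[float]

# Example: log a single ranking record
from arize.api import Client
from arize.utils.types import Environments, ModelTypes, RankingPredictionLabel, RankingActualLabel

# Placeholder credentials; replace with the keys for your own space.
arize = Client(space_key="YOUR_SPACE_KEY", api_key="YOUR_API_KEY")

# Placeholder feature values; replace with your model's features.
features = {"feature_name": "feature_value"}

pred_label = RankingPredictionLabel(
    group_id="A",
    rank=1,
    score=1.0,
    label="click",
)
act_label = RankingActualLabel(
    relevance_labels=["click", "save"],
    relevance_score=0.5,
)

response = arize.log(
    model_id="demo-ranking-single-log",
    model_version="v1",
    environment=Environments.PRODUCTION,
    model_type=ModelTypes.RANKING,
    prediction_id="123",
    prediction_label=pred_label,
    actual_label=act_label,
    features=features,
)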

Enhancements

Ranking: Dashboard Template

Track rank-aware evaluation metric fluctuations and analyze ranking slices with the ranking dashboard template. Learn more about ranking models here and dashboards here.
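As an illustration of what a rank-aware evaluation metric measures, here is a minimal sketch of NDCG@k in plain Python. The note above does not say which metrics the template uses or how Arize computes them, so this is a textbook formulation only; the function names and the sample relevance scores are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k items, in ranked order."""
    return sum(
        rel / math.log2(position + 1)
        for position, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance scores for one group, listed by predicted rank 1..4.
ranked_relevance = [0.0, 1.0, 0.5, 0.0]
score = ndcg_at_k(ranked_relevance, k=3)
print(f"NDCG@3 = {score:.4f}")
```

Because the most relevant item sits at rank 2 rather than rank 1, the score falls below 1.0; a perfectly ordered group would score exactly 1.0.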

Embeddings: Cluster Distribution Chart

Quickly identify model drift by comparing the number of points in each cluster between your primary dataset and your baseline. Learn more about clusters here.
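The idea behind comparing point counts can be sketched in a few lines of Python. This is an assumption-laden illustration, not Arize's implementation: it supposes each embedding has already been assigned a cluster ID, and the helper names and sample IDs are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def cluster_share(cluster_ids: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Fraction of a dataset's points that fall in each cluster."""
    counts = Counter(cluster_ids)
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


def share_gap(primary_ids, baseline_ids) -> Dict[Hashable, float]:
    """Primary share minus baseline share, per cluster."""
    primary = cluster_share(primary_ids)
    baseline = cluster_share(baseline_ids)
    clusters = sorted(set(primary) | set(baseline))
    return {c: primary.get(c, 0.0) - baseline.get(c, 0.0) for c in clusters}


# Hypothetical cluster assignments for each dataset's points.
primary_clusters = [0, 0, 0, 1, 2, 2, 2, 2]
baseline_clusters = [0, 0, 1, 1, 1, 1, 2, 2]

gaps = share_gap(primary_clusters, baseline_clusters)
for cluster, gap in gaps.items():
    print(f"cluster {cluster}: {gap:+.3f}")
```

Clusters with a large positive or negative gap are the ones where the primary dataset is over- or under-represented relative to the baseline, which is where drift is most likely to be concentrated.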

Embeddings: Single Dataset Support

Send a single dataset of training, validation, or production data when ingesting embeddings data into Arize, and view that dataset on its own in UMAP to gain insights into your data. Learn more about embeddings here.

Model Overview: Onboarding Checklist

Follow the steps on the onboarding checklist to finish setting up your model in Arize.

In The News

New Course Additions!

Arize's self-guided ML observability course offers modules on how to calculate and use some of the most common metrics in model monitoring – including for performance, drift, data quality, fairness, and service-level performance – as well as the latest techniques in monitoring embeddings and unstructured data.

New additions include:

  • KNN Algorithm: K nearest neighbor is a valuable part of any machine learning toolbox and is useful in a variety of real-world applications, from image classification to sentiment analysis. This piece gives a technical overview of the KNN algorithm and explains its practical implementation, applications, limitations, and areas for improvement.

  • JS Divergence vs. KL Divergence: This deep dive covers what JS divergence is and how it differs from KL divergence, how to use JS divergence in machine learning model measurement, how the unique approach of a mixture distribution helps with a common measurement issue, and more.

  • Feature Store: A neutral take on why machine learning teams are adopting feature stores, the major tools, and a few tips for how to monitor your feature store.

  • Tokenization: For unstructured text data and natural language processing (NLP) applications, tokenization is a core component of the processing pipeline. In this great primer, we take a closer look at what tokenization is, why it matters, how to use it, and some exciting applications in the real world (including in generative AI!).
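To make the JS divergence item above concrete, here is a minimal sketch using the standard definitions of KL and JS divergence over binned distributions. It assumes the "common measurement issue" is an empty bin in one distribution, which the blurb does not state explicitly; the function names and sample distributions are hypothetical, and this is not the course's own code.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in nats; infinite when q has an empty bin where p does not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # a zero-probability bin in p contributes nothing
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """JS divergence: average KL of p and q against their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical binned feature distributions; the baseline has an empty bin.
baseline = [0.5, 0.5, 0.0]
production = [0.4, 0.4, 0.2]

kl = kl_divergence(production, baseline)
js = js_divergence(production, baseline)
print(f"KL(production || baseline) = {kl}")
print(f"JS(production, baseline)   = {js:.4f}")
```

The mixture has mass in every bin where either input does, so JS divergence stays finite and symmetric even when KL divergence blows up on the empty bin.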

Arize AI completed the Payment Card Industry Data Security Standard (PCI DSS) 4.0 certification! Recently updated to address emerging threats and attack vectors, PCI DSS 4.0 is a global standard that provides a baseline of technical and operational requirements designed to protect account data.

AI Coffee Break: Why ChatGPT Fails: Explaining Language Model Limitations

A chat with Zippi’s Data Lead Valeria Gomes about the company’s origin story, best practices for building an ML practice in fintech, and why the company selected Arize as its ML observability partner.
