Frequently asked questions about the product
Arize natively supports tabular/structured data types (strings, floats, booleans, etc.), as well as embeddings for NLP, image, and other unstructured data types.
Arize can surface outlier/anomalous data through:
- Numeric Features: Arize will monitor numeric inputs for outliers in their value ranges.
- Categorical Features: Arize will monitor outlier categories and the overall cardinality of categorical features.
If there are feature slices that vary significantly from the set baseline distribution, Arize will alert you through drift detection monitors.
If there are outlier slices that are performing poorly, Arize’s feature performance heatmap will automatically surface the worst-performing segments. These slices can also be monitored explicitly for proactive detection of performance degradation.
Arize supports a comprehensive list of model performance metrics for both numeric and categorical model types. These metrics are available on dashboards as well as monitors. In addition to the out-of-the-box metrics listed below, Arize also supports model data metrics, custom evaluation metrics, and user-defined business impact metrics. Learn more about statistical widgets here and user-defined business impact formulas here.
In addition to performance metrics, we also support data metrics that allow you to count, average, view percentiles, or calculate percent/count for all features, actuals, and/or predictions. All metrics can be calculated in aggregate, as well as on particular cohorts using applied filters.
You can monitor the performance of the model for a particular feature or feature-value combination, also known as a slice. The feature performance heatmap helps visualize the performance of each slice and indicates which slices are the most problematic or performance-degrading.
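As a rough illustration of what slice-level performance means, the sketch below groups predictions by one feature's values and ranks the slices from worst to best accuracy. The row layout, the `state` feature, and the `accuracy_by_slice` helper are hypothetical and are not part of the Arize SDK; accuracy stands in for whichever performance metric you track.

```python
from collections import defaultdict


def accuracy_by_slice(rows, feature):
    """Return (feature_value, accuracy) pairs sorted worst-first.

    `rows` is a list of dicts holding the feature value plus
    "prediction" and "actual" keys (a hypothetical layout).
    """
    totals = defaultdict(lambda: [0, 0])  # value -> [correct, count]
    for row in rows:
        stats = totals[row[feature]]
        stats[0] += int(row["prediction"] == row["actual"])
        stats[1] += 1
    scores = {value: correct / count for value, (correct, count) in totals.items()}
    # Lowest accuracy first, so the most problematic slices lead the list.
    return sorted(scores.items(), key=lambda item: item[1])


rows = [
    {"state": "CA", "prediction": 1, "actual": 1},
    {"state": "CA", "prediction": 0, "actual": 0},
    {"state": "NY", "prediction": 1, "actual": 0},
    {"state": "NY", "prediction": 1, "actual": 1},
    {"state": "TX", "prediction": 0, "actual": 1},
]
ranked = accuracy_by_slice(rows, "state")
print(ranked)
```

Here the `TX` slice ranks first because its only prediction was wrong, which is the kind of segment a heatmap would highlight.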
Arize drift detection can flag when categorical features see a % of unseen categories. For example, if the baseline had 10 categories, but the production/serving distribution differed significantly in number, Arize will trigger an alert. Additionally, Arize captures the percentage of values that fall into these new feature categories not previously seen in the baseline distribution.
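The idea of measuring values that land in previously unseen categories can be sketched in a few lines. This is only an illustration of the concept, not Arize's implementation; the helper name `unseen_category_share` and the sample data are made up.

```python
from collections import Counter


def unseen_category_share(baseline, production):
    """Return (new_categories, fraction of production values in them)."""
    known = set(baseline)
    counts = Counter(production)
    new_categories = sorted(c for c in counts if c not in known)
    total = sum(counts.values())
    if total == 0:
        return new_categories, 0.0
    unseen = sum(counts[c] for c in new_categories)
    return new_categories, unseen / total


baseline = ["red", "green", "blue", "red"]
production = ["red", "teal", "teal", "blue", "pink"]
new_cats, share = unseen_category_share(baseline, production)
print(new_cats, share)
```

In this sample, three of the five production values belong to categories that never appear in the baseline.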
Arize drift detection can show the % of values outside of the baseline range. Arize uses the quantiles of the data to calculate the bins of the distribution. If the baseline range has a larger range than the production/serving environment, the user can see the % of volume where the baseline distribution was outside of the production/serving distribution. If the production/serving distribution was outside the range of the baseline distribution, similarly Arize surfaces the % of volume for values outside the baseline range.
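To make the "percent of volume outside the range" idea concrete, here is a minimal sketch that compares one distribution against the min/max range of another, in either direction. The function name and sample values are hypothetical, and Arize's own calculation works on its quantile-derived bins rather than on raw values like this.

```python
def share_outside_range(reference, other):
    """Fraction of values in `other` that fall outside [min, max] of `reference`."""
    if not reference or not other:
        raise ValueError("both distributions need at least one value")
    low, high = min(reference), max(reference)
    outside = sum(1 for value in other if value < low or value > high)
    return outside / len(other)


baseline = [10, 12, 15, 20]
production = [8, 11, 14, 19, 25]

# Share of production volume outside the baseline range (8 and 25).
prod_outside_baseline = share_outside_range(baseline, production)
# Share of baseline volume outside the production range (none here).
baseline_outside_prod = share_outside_range(production, baseline)
print(prod_outside_baseline, baseline_outside_prod)
```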
Arize calculates drift metrics such as Population Stability Index, KL Divergence, and Wasserstein Distance. Arize computes drift by measuring distribution changes between the model’s production values and a baseline (reference dataset). Users can configure a baseline to be any time window of a:
- Pre-production dataset (training, test, validation), or
- Fixed or moving time period from production (e.g. last 30 days, last 60 days).
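For intuition about one of these drift metrics, the sketch below computes the Population Stability Index from per-bin counts using the standard formula, the sum over bins of (p - b) * ln(p / b), where b and p are the baseline and production fractions in each bin. The `eps` floor for empty bins is an assumption made for this example; how Arize handles empty bins is not stated here.

```python
import math


def psi(baseline_counts, production_counts, eps=1e-6):
    """Population Stability Index over matching bins.

    Fractions are floored at `eps` (an assumed smoothing choice)
    so that empty bins do not produce log(0) or division by zero.
    """
    if len(baseline_counts) != len(production_counts):
        raise ValueError("both distributions must use the same bins")
    b_total = sum(baseline_counts)
    p_total = sum(production_counts)
    if b_total == 0 or p_total == 0:
        raise ValueError("each distribution needs at least one observation")
    total = 0.0
    for b, p in zip(baseline_counts, production_counts):
        b_frac = max(b / b_total, eps)
        p_frac = max(p / p_total, eps)
        total += (p_frac - b_frac) * math.log(p_frac / b_frac)
    return total


same = psi([10, 20, 30], [10, 20, 30])  # identical shapes give 0
shifted = psi([50, 50], [25, 75])       # mass moved into the second bin
print(same, round(shifted, 4))
```

Identical distributions give a PSI of zero, and the value grows as production mass moves away from the baseline bins.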
Arize supports automated schema detection of models and immediately computes statistics for all features of the model, including:
Arize supports feature quality metrics including feature drift, data quality (e.g., cardinality, percent empty, type mismatch, out of range), and feature importance metrics. Additionally, users can compute performance metrics for their model filtered by feature/value combinations (slices).
Concept drift is drift in the actuals or ground truth. To measure concept drift, Arize requires historical actuals which are utilized to set a baseline.
Arize calculates the bins within the drift tab using quantiles and fixed bins from the baseline distribution.
The range between two quantile values in the baseline distribution is used to calculate a fixed bin width. That width is then used to build a finite set of fixed-width bins (currently 8) extending from the median value in both directions (4 in each direction). Lastly, Arize adds two "bookend" bins: one from the minimum value to the lowest bin's edge, and another from the highest bin's edge to the maximum value across both distributions.
This strategy optimizes for reasonably sized bins by calculating a fixed width based on quantile values.
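The binning strategy described above can be sketched as follows. The text does not say which two quantiles are used or how their range maps to the bin width, so this example assumes the 10th and 90th percentiles and divides their range by the number of core bins; `drift_bin_edges` and its parameters are hypothetical and shown only to illustrate the shape of the result.

```python
import statistics


def drift_bin_edges(baseline, production, low_q=10, high_q=90, n_core_bins=8):
    """Edges for fixed-width bins around the baseline median, plus bookends.

    Assumption: width = (P90 - P10) / n_core_bins. The quantiles Arize
    actually uses are not given in the text above.
    """
    # quantiles(n=100) returns 99 cut points; index q - 1 is the q-th percentile.
    cuts = statistics.quantiles(baseline, n=100)
    width = (cuts[high_q - 1] - cuts[low_q - 1]) / n_core_bins
    if width <= 0:
        raise ValueError("baseline has no spread between the chosen quantiles")
    median = statistics.median(baseline)
    half = n_core_bins // 2
    # 9 edges give 8 core bins, 4 on each side of the median.
    edges = [median + i * width for i in range(-half, half + 1)]
    overall_min = min(min(baseline), min(production))
    overall_max = max(max(baseline), max(production))
    # Bookend bins reach out to the extremes across both distributions.
    if overall_min < edges[0]:
        edges.insert(0, overall_min)
    if overall_max > edges[-1]:
        edges.append(overall_max)
    return edges


baseline = list(range(1, 102))  # 1..101, median 51
production = [0, 50, 120]
edges = drift_bin_edges(baseline, production)
print(len(edges) - 1, "bins:", [round(e, 1) for e in edges])
```

With this sample data the result is 8 core bins centered on the median plus one bookend bin on each side, 10 bins in total.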
Arize is SOC 2 Type 2 certified under standards set by the American Institute of Certified Public Accountants (AICPA). Arize’s SOC 2 security certification validates that Arize has adequate processes and policies to securely handle both customer and organizational data.
Arize AI has also received certifications from an independent auditor validating that the company’s health information security program is fairly represented and includes the essential elements of HIPAA’s Security Rule and the HITECH Act. Read more here.
Drift: An automatic threshold for drift metrics is defined as 2 standard deviations above the mean of the calculated metric value for the latest 14 days of data (with up to a 3 day delay).
Performance: An automatic threshold for performance metrics is defined as 2 standard deviations above/below (depending on the metric) the mean of the calculated metric value for the latest 14 days of data (with up to a 16 day delay).
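The threshold rule above amounts to the mean plus or minus two standard deviations over a 14-day window. A minimal sketch follows, assuming one metric value per day and the population standard deviation; the text does not specify which estimator or aggregation is used, and `auto_threshold` is a hypothetical helper.

```python
import statistics


def auto_threshold(daily_values, n_std=2.0, direction="above"):
    """Mean +/- n_std standard deviations of the supplied daily metric values."""
    if direction not in ("above", "below"):
        raise ValueError("direction must be 'above' or 'below'")
    mean = statistics.fmean(daily_values)
    std = statistics.pstdev(daily_values)  # population std dev: an assumption
    offset = n_std * std
    return mean + offset if direction == "above" else mean - offset


# 14 daily values of a metric (made-up numbers).
last_14_days = [0.1] * 7 + [0.3] * 7
upper = auto_threshold(last_14_days)                     # drift-style threshold
lower = auto_threshold(last_14_days, direction="below")  # e.g. for accuracy
print(upper, lower)
```

Drift metrics alert when the value rises above the threshold. For performance metrics the direction depends on the metric: a metric where higher is better, such as accuracy, would use the "below" threshold.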
That helper function is for the real-time logger, where it returns a future. The pandas logger (from arize.pandas.logger import Client) just returns a response, so you can check status with
Yes. Use the Arize date range selector to select a date range of less than 3 days, and the platform will then switch to hourly granularity.
Yes. Our current implementation of vector drift looks at the vector as a whole. However, you could log the feature space that generated the n-dimensional feature to determine the prediction drift impact.