Overview of how to use Arize for demand forecasting models
Using the Arize Dashboard function, you can quickly set up a Mean Error monitor that notifies you when the model's Mean Error dips below a specified threshold. You can do the same for any evaluation metric on Arize (e.g., MAE, MSE, MAPE).
Monitoring Prediction Bias with Mean Error
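The logic behind such a monitor can be sketched in a few lines. This is only an illustration of the threshold check, not Arize's implementation or API: the function names, the sign convention (prediction minus actual), the threshold value, and the data are all assumptions made for the example.

```python
def mean_error(predictions, actuals):
    """Mean of (prediction - actual). The sign convention is an assumption."""
    return sum(p - a for p, a in zip(predictions, actuals)) / len(actuals)


def should_alert(metric_value, lower_threshold):
    """Return True when the metric has dipped below the configured threshold."""
    return metric_value < lower_threshold


# Hypothetical evaluation window in which the model under-predicts demand.
window_predictions = [80.0, 85.0, 90.0]
window_actuals = [100.0, 100.0, 100.0]

me = mean_error(window_predictions, window_actuals)
alert = should_alert(me, lower_threshold=-10.0)  # hypothetical threshold

print(me)     # -15.0
print(alert)  # True
```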
Mean Error alone does not tell the whole story. In a feature drift event that causes both over-prediction and under-prediction, the positive and negative errors can cancel out, leaving a Mean Error close to zero. This is why we need not only monitors but also a side-by-side chart that lets us compare Mean Absolute Error (the magnitude of the error) with Mean Error (its direction).
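A small worked example makes the cancellation concrete. The numbers below are made up for illustration, and the sign convention (positive error means over-prediction) is an assumption.

```python
import numpy as np

# Hypothetical demand data: the model over-predicts for the first two
# observations and under-predicts by the same amount for the last two.
actuals = np.array([100.0, 100.0, 100.0, 100.0])
predictions = np.array([120.0, 120.0, 80.0, 80.0])

# Assumed sign convention: positive error = over-prediction.
errors = predictions - actuals

mean_error = errors.mean()                   # signed errors cancel out
mean_absolute_error = np.abs(errors).mean()  # magnitudes do not cancel

print(mean_error)           # 0.0
print(mean_absolute_error)  # 20.0
```

A Mean Error monitor would stay silent here, while Mean Absolute Error shows that every prediction is off by 20 units.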
We can configure time series cards with custom model metrics on the Performance Dashboard, as shown below.
Creating a custom time series widget for Mean Error + Mean Absolute Error
We can also create a Data Metric time series widget to visualize the mean values of our predictions vs. actuals and gain further insight into any potential biases.
Creating a Data Metric Time Series Widget
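The comparison this widget draws can be reproduced offline as a sanity check. The sketch below averages predictions and actuals per period and reports the direction of the gap; the helper name `mean_by_period`, the period labels, and the data are hypothetical and are not part of Arize.

```python
from collections import defaultdict


def mean_by_period(periods, values):
    """Average the values within each period label, in first-seen order."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for period, value in zip(periods, values):
        totals[period] += value
        counts[period] += 1
    return {period: totals[period] / counts[period] for period in totals}


# Hypothetical data: over-prediction in week1, under-prediction in week2.
periods = ["week1", "week1", "week2", "week2"]
predictions = [120.0, 110.0, 80.0, 90.0]
actuals = [100.0, 100.0, 100.0, 100.0]

mean_pred = mean_by_period(periods, predictions)
mean_actual = mean_by_period(periods, actuals)

bias = {}
for period in mean_pred:
    gap = mean_pred[period] - mean_actual[period]
    if gap > 0:
        bias[period] = "over-prediction"
    elif gap < 0:
        bias[period] = "under-prediction"
    else:
        bias[period] = "no bias"
    print(period, mean_pred[period], mean_actual[period], bias[period])
```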
In the Arize Dashboard shown below, we can see an over-prediction event first, followed by an under-prediction event. The custom configuration makes the magnitude of these errors easy to read.
Arize can also be used to triage your ML model performance. The model performance troubleshooting tools are designed by ML engineers, for ML engineers, to help you understand and resolve model performance issues.
Identifying features responsible for drift
Feature drifted during period of under-prediction
Not all feature drift is inherently harmful; only some drift actually degrades model performance.
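To judge whether a drifted feature lines up with a period of poor performance, it helps to put a number on the drift. The sketch below uses the Population Stability Index (PSI), one common drift measure; this page does not say which metric Arize computes, so treat the choice of PSI, the function name, the bin count, and the synthetic data as assumptions.

```python
import numpy as np


def population_stability_index(baseline, production, bins=10, eps=1e-6):
    """PSI between two samples of one feature, using baseline-derived bins."""
    # Bin edges come from the baseline so both samples share the same buckets.
    edges = np.histogram_bin_edges(baseline, bins=bins)
    # Clip production values so out-of-range points fall into the edge bins.
    production = np.clip(production, edges[0], edges[-1])
    base_counts, _ = np.histogram(baseline, bins=edges)
    prod_counts, _ = np.histogram(production, bins=edges)
    # eps avoids division by zero and log(0) for empty bins.
    base_frac = base_counts / base_counts.sum() + eps
    prod_frac = prod_counts / prod_counts.sum() + eps
    return float(np.sum((prod_frac - base_frac) * np.log(prod_frac / base_frac)))


# Synthetic feature values: one production sample matches the baseline
# distribution, the other has a shifted mean.
rng = np.random.default_rng(0)
baseline = rng.normal(loc=0.0, scale=1.0, size=5000)
stable = rng.normal(loc=0.0, scale=1.0, size=5000)
shifted = rng.normal(loc=1.5, scale=1.0, size=5000)

psi_stable = population_stability_index(baseline, stable)
psi_shifted = population_stability_index(baseline, shifted)
print(psi_stable, psi_shifted)
```

A high PSI only says the distribution moved. Whether that movement matters still has to be checked against the error metrics for the same time window.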
With the insights Arize provides, you can dig into root causes and quickly build intuition, allowing ML teams to iterate, experiment, and ship new models to production faster.
By visualizing feature drift and understanding which features are responsible, ML engineers gain additional information to work with when troubleshooting model performance issues and improving their models.
Some possible conclusions and action items for our engineers might be:
1. Examining possible concept drift related to the features in question
2. Retraining our model to fit the new distributions specific to this drift