This example walks through setting up a collaborative filtering ranking model in the Arize platform. Collaborative filtering models are commonly used in recommendation engines to produce ranked, personalized recommendations. Here, the ranked recommendations are based on predicting the 5-star rating a user would give to a movie they have not yet seen.
The above picture represents a typical collaborative filtering scenario. A matrix is built from activity data, in this case people and the products they rate. Predictions of similar objects are typically based on a distance (similarity) metric, such as cosine similarity or Euclidean distance.
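As a rough illustration of the similarity step, the sketch below compares rows of a small rating matrix using both metrics. The user names, the ratings, and the choice to encode an unrated movie as 0 are assumptions made for this example only; production systems usually handle missing ratings more carefully.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a user with no ratings is treated as similar to no one
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two rating vectors (0.0 = identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical user-by-movie rating matrix; 0 stands in for "not rated".
ratings = {
    "alice": [5, 4, 0, 1],
    "bob":   [4, 5, 0, 2],
    "carol": [1, 0, 5, 4],
}

for other in ("bob", "carol"):
    cos = cosine_similarity(ratings["alice"], ratings[other])
    dist = euclidean_distance(ratings["alice"], ratings[other])
    print(f"alice vs {other}: cosine={cos:.3f}, euclidean={dist:.3f}")
```

In this toy data, alice and bob rate the same movies highly, so they score a higher cosine similarity and a smaller Euclidean distance than alice and carol.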
The common data tracked includes:
- Timestamp: The date/time of the prediction/recommendation event.
- Prediction ID: A unique ID for a single prediction within a ranked list.
- Group ID: The group ID, such as a user or category, that identifies an entire ranked list.
- Relevancy Score: A score capturing the actual relevance of a prediction.
- Relevancy Labels: The truth label such as "purchase" or "click".
- Rank: A numeric representation of the order of the prediction in the ranked list.
- Prediction Score: The predicted score used to rank the list of predicted recommendations.
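The fields listed above could be represented per prediction roughly as follows. This is a minimal sketch, not the Arize SDK schema: the `RankingRecord` class, its attribute names, the `build_ranked_list` helper, the prediction ID format, and the "liked" threshold of 4 stars are all hypothetical and chosen only to mirror the list.

```python
from dataclasses import dataclass
from datetime import datetime

LIKE_THRESHOLD = 4.0  # assumption: ratings of 4 stars or more count as "liked"

@dataclass
class RankingRecord:
    timestamp: datetime        # when the recommendation was made
    prediction_id: str         # unique per item within a ranked list
    group_id: str              # identifies the whole ranked list (here, a user)
    prediction_score: float    # predicted rating used to order the list
    rank: int                  # 1 = top of the list
    relevancy_score: float     # actual rating once it is known
    relevancy_label: str       # categorical truth label

def build_ranked_list(group_id, scored_items, timestamp):
    """Turn (item_id, predicted_rating, actual_rating) tuples into ranked records."""
    # Highest predicted score first; rank is the 1-based position in that order.
    ordered = sorted(scored_items, key=lambda item: item[1], reverse=True)
    return [
        RankingRecord(
            timestamp=timestamp,
            prediction_id=f"{group_id}-{item_id}",
            group_id=group_id,
            prediction_score=predicted,
            rank=position,
            relevancy_score=actual,
            relevancy_label="liked" if actual >= LIKE_THRESHOLD else "not_liked",
        )
        for position, (item_id, predicted, actual) in enumerate(ordered, start=1)
    ]
```

Every record in one list shares the same group ID, while each prediction ID stays unique, which matches the distinction the field list draws between a single prediction and an entire ranked list.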
The common metrics for collaborative filtering are:
- Log loss (for click vs. no-click prediction)
- MAE, MAPE, RMSE (for rating prediction)
- NDCG or Recall @ k
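For reference, the metrics listed above can be sketched in plain Python as shown below. These are textbook definitions, not the platform's implementation. NDCG in particular has more than one common variant; this sketch uses linear gain (the relevance value itself), whereas some definitions use an exponential gain of `2^rel - 1`. MAPE is left out for brevity.

```python
import math

def mae(actual, predicted):
    """Mean absolute error between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted ratings."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def log_loss(labels, probabilities, eps=1e-15):
    """Binary log loss; labels are 0/1 (e.g. no-click/click)."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(labels)

def recall_at_k(relevant_flags, k):
    """Share of all relevant items that appear in the top k.

    relevant_flags is the ranked list, top first, with 1 = relevant, 0 = not.
    """
    total_relevant = sum(relevant_flags)
    if total_relevant == 0:
        return 0.0
    return sum(relevant_flags[:k]) / total_relevant

def dcg_at_k(relevances, k):
    """Discounted cumulative gain with linear gain and a log2 position discount."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG of the ranked list divided by the DCG of its ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

The inputs to `recall_at_k` and `ndcg_at_k` are relevance values already ordered by rank, so a list sorted from most to least relevant yields an NDCG of 1.0 and any other ordering yields less.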