AI-Powered Data Quality Insights

Data Quality

Check for Missing Data

  • Suggested Prompt: "What dimensions have high percent empty?"

  • Use When: You want to identify potential data quality issues that could impact model inputs

  • Description: Analyzes input data to report on the percentage of missing data in features and tags, highlighting any sudden spikes or changes.
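
The kind of check described above can be approximated outside the product. The following is a minimal sketch, not Arize's implementation: it assumes rows are plain dictionaries, treats `None` and the empty string as "empty", and compares a baseline window against a current window. The function names (`percent_empty`, `missing_spikes`) and the 10-point threshold are hypothetical choices made for illustration.

```python
from typing import Optional

Row = dict[str, Optional[object]]


def percent_empty(rows: list[Row], columns: list[str]) -> dict[str, float]:
    """Return, per column, the percentage of rows where the value is None or ''."""
    if not rows:
        return {col: 0.0 for col in columns}
    result = {}
    for col in columns:
        empty = sum(1 for row in rows if row.get(col) in (None, ""))
        result[col] = 100.0 * empty / len(rows)
    return result


def missing_spikes(
    baseline: list[Row],
    current: list[Row],
    columns: list[str],
    threshold_pts: float = 10.0,  # assumed cutoff, in percentage points
) -> dict[str, tuple[float, float]]:
    """Return columns whose percent-empty rose by more than threshold_pts."""
    base = percent_empty(baseline, columns)
    cur = percent_empty(current, columns)
    return {
        col: (base[col], cur[col])
        for col in columns
        if cur[col] - base[col] > threshold_pts
    }


# Illustrative data only.
baseline_rows = [
    {"age": 30, "state": "CA"},
    {"age": 41, "state": "NY"},
    {"age": None, "state": "TX"},
    {"age": 25, "state": "WA"},
]
current_rows = [
    {"age": None, "state": "CA"},
    {"age": None, "state": ""},
    {"age": 33, "state": "NY"},
    {"age": None, "state": "TX"},
]

print(missing_spikes(baseline_rows, current_rows, ["age", "state"]))
# {'age': (25.0, 75.0), 'state': (0.0, 25.0)}
```

This sketch does not treat floating-point NaN as empty; data loaded through a dataframe library would likely need that case handled as well.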

Assess Feature Data Quality

  • Suggested Prompt: "What features have data quality issues?"

  • Use When: You want to understand data quality at a high level

  • Description: Helps machine learning engineers debug issues by analyzing dataset metrics in detail, focusing on the dimensions relevant to the user's investigation. Identifies critical changes, such as drift or shifts in cardinality, and suggests concrete next steps for investigating and resolving them.
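
To make "drift" concrete, the sketch below computes the Population Stability Index (PSI) for a categorical feature between a baseline and a current sample. The source does not say which drift metric this skill uses, so PSI is an assumption chosen for illustration. The function name `psi` and the smoothing constant `eps` are hypothetical.

```python
import math
from collections import Counter


def psi(baseline: list[str], current: list[str], eps: float = 1e-6) -> float:
    """Population Stability Index between two samples of a categorical feature.

    PSI = sum over categories of (p_cur - p_base) * ln(p_cur / p_base).
    Proportions are floored at eps (an assumed smoothing constant) so that
    categories missing from one sample do not cause division by zero.
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    base_counts = Counter(baseline)
    cur_counts = Counter(current)
    score = 0.0
    for category in set(base_counts) | set(cur_counts):
        p_base = max(base_counts[category] / len(baseline), eps)
        p_cur = max(cur_counts[category] / len(current), eps)
        score += (p_cur - p_base) * math.log(p_cur / p_base)
    return score


# Illustrative data: the share of "a" moves from 50% to 90%.
baseline_values = ["a"] * 50 + ["b"] * 50
current_values = ["a"] * 90 + ["b"] * 10
drift_score = psi(baseline_values, current_values)
```

A common rule of thumb reads PSI above roughly 0.25 as a significant shift, though an appropriate cutoff depends on the feature and the use case.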

Evaluate Distribution Shifts

  • Suggested Prompt: "Analyze distribution shift for "

  • Use When: You want to understand a given dimension's distribution

  • Description: Analyzes a given dimension's distribution to identify which slices have significantly changed their share of the distribution.
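
A slice-level view of a distribution shift can be sketched as follows. This is an illustrative approximation rather than the product's logic: it assumes a categorical dimension, compares each value's share between a baseline and a current sample, and reports values whose share moved by at least a chosen number of percentage points. The name `slice_share_shifts` and the 5-point default are hypothetical.

```python
from collections import Counter


def slice_share_shifts(
    baseline: list[str],
    current: list[str],
    min_shift_pts: float = 5.0,  # assumed cutoff, in percentage points
) -> dict[str, float]:
    """Return {slice value: change in share, in percentage points}.

    Only slices whose share changed by at least min_shift_pts are kept,
    ordered from the largest absolute change to the smallest.
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    base_counts = Counter(baseline)
    cur_counts = Counter(current)
    shifts = {}
    for value in set(base_counts) | set(cur_counts):
        base_pct = 100.0 * base_counts[value] / len(baseline)
        cur_pct = 100.0 * cur_counts[value] / len(current)
        delta = cur_pct - base_pct
        if abs(delta) >= min_shift_pts:
            shifts[value] = delta
    return dict(sorted(shifts.items(), key=lambda kv: abs(kv[1]), reverse=True))


# Illustrative data for a hypothetical "platform" dimension.
baseline_platform = ["web"] * 60 + ["ios"] * 30 + ["android"] * 10
current_platform = ["web"] * 40 + ["ios"] * 32 + ["android"] * 28

print(slice_share_shifts(baseline_platform, current_platform))
# {'web': -20.0, 'android': 18.0}
```

Here "ios" is omitted because its share moved by only 2 percentage points, below the assumed cutoff.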

  • Suggested Prompt: "Analyze changes in feature cardinality"

  • Use When: You want to analyze changes in the cardinality of your features

  • Description: Analyzes changes in the cardinality of features and tags over time, flagging unusual variations that might indicate data quality problems. Particularly valuable when direct model performance metrics are not available.
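
Tracking cardinality over time can be sketched as counting distinct values per time window and flagging large jumps between consecutive windows. The example below is a rough illustration under stated assumptions: records arrive as `(window_label, row)` pairs in chronological order, and a change is "unusual" when the distinct count at least doubles or halves. The function names and the `max_ratio` parameter are hypothetical and do not come from the product.

```python
def cardinality_by_window(
    records: list[tuple[str, dict]], feature: str
) -> dict[str, int]:
    """Count distinct non-null values of `feature` in each window.

    Windows appear in the result in the order they are first seen, so the
    input is assumed to be in chronological order.
    """
    seen: dict[str, set] = {}
    for window, row in records:
        bucket = seen.setdefault(window, set())
        value = row.get(feature)
        if value is not None:
            bucket.add(value)
    return {window: len(values) for window, values in seen.items()}


def unusual_cardinality_changes(
    counts: dict[str, int], max_ratio: float = 2.0  # assumed cutoff
) -> list[tuple[str, str, int, int]]:
    """Return (previous window, window, previous count, count) for big jumps."""
    windows = list(counts)
    flagged = []
    for prev, cur in zip(windows, windows[1:]):
        before, after = counts[prev], counts[cur]
        low, high = sorted((before, after))
        appeared_or_vanished = low == 0 and high > 0
        large_ratio = low > 0 and high / low >= max_ratio
        if appeared_or_vanished or large_ratio:
            flagged.append((prev, cur, before, after))
    return flagged


# Illustrative data: the "city" feature jumps from 2 to 5 distinct values.
records = [
    ("2023-01", {"city": "NYC"}),
    ("2023-01", {"city": "LA"}),
    ("2023-02", {"city": "NYC"}),
    ("2023-02", {"city": "SF"}),
] + [("2023-03", {"city": c}) for c in ["NYC", "LA", "SF", "nyc", "N.Y.C."]]

counts = cardinality_by_window(records, "city")
print(unusual_cardinality_changes(counts))
# [('2023-02', '2023-03', 2, 5)]
```

In this made-up data the jump comes from inconsistent spellings of the same city, the sort of upstream formatting problem a cardinality change can surface even when no model performance metrics are available.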

Copyright © 2023 Arize AI, Inc