Embedding & Cluster Analyzer


What is an Embedding Projector?

Embedding projectors are a great tool for visualizing and interpreting embeddings. To do so, an algorithm must first reduce the dimensionality of the embeddings to 2D or 3D. UMAP (Uniform Manifold Approximation and Projection) is a dimensionality-reduction technique that belongs to the neighbor-graph category; Arize uses it to create a lower-dimensional (2D or 3D) representation of your dataset's embedding vectors.

Learn more about UMAP in comparison to other neighbor-graph algorithms.
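The reduction step can be illustrated with a short sketch. Arize uses UMAP (in Python, UMAP is provided by the third-party `umap-learn` package); to stay dependency-free, this sketch substitutes a plain PCA projection via numpy, which shows the same idea of mapping each high-dimensional embedding to a 2D point.

```python
import numpy as np

def project_to_2d(embeddings: np.ndarray) -> np.ndarray:
    """Project (n, d) embedding vectors to (n, 2) via PCA.

    Stand-in for UMAP: both map each high-dimensional embedding
    to a low-dimensional point suitable for plotting.
    """
    centered = embeddings - embeddings.mean(axis=0)
    # Top-2 principal directions from the SVD of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
emb = rng.normal(size=(500, 64))   # e.g. 500 embeddings of dimension 64
points_2d = project_to_2d(emb)
print(points_2d.shape)             # (500, 2)
```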

How to use UMAP?

  1. Select your data and Generate UMAP

    • Select a point on the drift visualization at the top of the page and generate UMAP to visualize that point in time.

  2. Investigate your worst-drifted clusters

    • Clusters are groups of related points in the point cloud. The closer points or clusters are to each other, the more similar they are. Clusters let you easily spot data that differs from your baseline, and they also reveal the global and local structure of your data.

  3. Investigate the data points that belong to that cluster

  4. Colorize/filter your data

    By applying colorizations and filters, you can identify patterns or structure in the data that explain where the model can be improved.

How to Colorize and Filter Points in UMAP

Users can colorize and filter the UMAP visualization:

  • By Dataset: points are colored based on whether they belong to the baseline or primary dataset.

  • By Prediction Label/Score: points are colored according to the prediction label/score produced by the model.

  • By Actual Label/Score: points are colored according to the actual (ground-truth) label/score.

  • By Correctness: points are colored based on whether or not the prediction was correct (i.e., whether the prediction label matches the actual label).

  • By Confusion Matrix: after selecting a positive class, points are colored by their confusion-matrix value (true positive, true negative, false positive, or false negative).

  • By Tag: identify patterns or insights in slices of data by choosing to color by tag.

  • By Feature: identify patterns or insights in slices of data by choosing to color by feature.
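The "By Correctness" and "By Confusion Matrix" colorizations above boil down to a simple per-point categorization. A minimal sketch (the function name and labels here are illustrative, not Arize's API):

```python
def confusion_category(predicted: str, actual: str, positive_class: str) -> str:
    """Assign a point its confusion-matrix cell for a chosen positive class."""
    if predicted == positive_class:
        return "true positive" if actual == positive_class else "false positive"
    return "false negative" if actual == positive_class else "true negative"

# Correctness is simply: predicted == actual.
print(confusion_category("fraud", "fraud", positive_class="fraud"))      # true positive
print(confusion_category("fraud", "not fraud", positive_class="fraud"))  # false positive
print(confusion_category("not fraud", "fraud", positive_class="fraud"))  # false negative
```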

How to Configure your UMAP Generation?

Users can configure their UMAP generation by these parameters:

  • Dimensions: choose between a 2D or a 3D plot.

  • Sample size: the number of points in the UMAP plot per dataset. For example, if you select 500, there will be 1000 points total in the plot. Allowed values range from 300 to 2500.
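Taken together with the nNeighbors and minDist parameters described later on this page, the documented ranges can be captured in a small validation helper (a hypothetical sketch; the Arize UI enforces these ranges itself):

```python
def validate_umap_config(dimensions: int, sample_size: int,
                         n_neighbors: int, min_dist: float) -> None:
    """Check a UMAP configuration against the ranges documented on this page.

    Hypothetical helper for illustration only.
    """
    assert dimensions in (2, 3), "plot must be 2D or 3D"
    assert 300 <= sample_size <= 2500, "sample size per dataset: 300-2500"
    assert 5 <= n_neighbors <= 100, "nNeighbors: 5-100"
    assert 0.0 <= min_dist <= 0.99, "minDist: 0.0-0.99"

# A sample size of 500 yields 1000 points total: 500 primary + 500 baseline.
validate_umap_config(dimensions=2, sample_size=500, n_neighbors=15, min_dist=0.1)
```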

What is Clustering?

Clustering is the process of grouping similar data points together. The goal of clustering is to find patterns and structure in a data set and to divide the data points into groups, or clusters, that share certain characteristics.
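A toy illustration of the idea, assuming nothing about Arize's actual algorithm: a two-cluster k-means pass over 2D points, which groups nearby points together.

```python
import math

def kmeans_2d(points, centroids, iters=10):
    """Tiny k-means: assign each 2D point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        centroids = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids, groups

points = [(0.1, 0.2), (0.3, 0.1), (-0.2, 0.0),     # one tight group
          (9.8, 10.1), (10.2, 9.9), (10.0, 10.3)]  # another far away
centroids, groups = kmeans_2d(points, centroids=[(0.0, 0.0), (1.0, 1.0)])
print([len(g) for g in groups])  # [3, 3]
```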

How does Arize cluster the UMAP points?

Arize's clustering algorithm is an unsupervised learning technique: it works on unlabeled data and finds patterns on its own.

How can clustering help me?

Clusters help you find patterns and structure in your dataset. You can troubleshoot performance degradation by examining the underlying data in the form of clusters and use those insights to improve your model's performance.

Examples:

  • You might realize that your model is confusing two similar classes (e.g., sandals and sneakers).

  • You might have a cluster with a drift score close to -1, meaning the model is seeing production data unlike its training data.

The drift score measures the reference-data coverage in a given cluster or point cloud. A score of -1 means the cluster contains only primary (production) data; a score of 1 means it contains only baseline data; a score of 0 means it is equally composed of baseline and primary data. The white and blue bars represent the count from each dataset in that cluster.
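One formula consistent with that description (the exact computation Arize uses is not specified on this page) is the normalized difference between the baseline and production counts in a cluster:

```python
def drift_score(n_baseline: int, n_production: int) -> float:
    """Normalized baseline-vs-production balance of a cluster.

    +1.0 -> only baseline data, -1.0 -> only primary (production) data,
     0.0 -> equal counts. Illustrative; not necessarily Arize's exact metric.
    """
    total = n_baseline + n_production
    if total == 0:
        raise ValueError("empty cluster")
    return (n_baseline - n_production) / total

print(drift_score(n_baseline=0, n_production=40))   # -1.0
print(drift_score(n_baseline=40, n_production=0))   # 1.0
print(drift_score(n_baseline=25, n_production=25))  # 0.0
```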

Cluster Metrics

After choosing your desired cluster metric (e.g., Euclidean distance, accuracy, or a custom metric), Arize automatically surfaces the clusters you should focus on for model improvement and troubleshooting so you can quickly find the root cause.

You can select the metric you want to use, and how you want the clusters to be sorted.
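Sorting clusters by a chosen metric so the worst ones surface first can be sketched as follows (the cluster records and field names here are hypothetical):

```python
# Hypothetical cluster summaries: id plus accuracy within the cluster.
clusters = [
    {"id": "c1", "accuracy": 0.92},
    {"id": "c2", "accuracy": 0.41},
    {"id": "c3", "accuracy": 0.77},
]

# Surface the worst-performing clusters first for troubleshooting.
worst_first = sorted(clusters, key=lambda c: c["accuracy"])
print([c["id"] for c in worst_first])  # ['c2', 'c3', 'c1']
```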

How do I use clusters to improve my model?

Download Clusters Locally

Once a cluster that is impacting model performance has been identified, users can download the data in the cluster for active learning. This data includes all the information needed for labeling workflows. These clusters are highly focused groups of datapoints, enabling labeling teams to be more precise in their efforts.
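For example, a downloaded cluster might be written out as a CSV for a labeling workflow; the columns below are illustrative, not Arize's actual export schema:

```python
import csv
import io

# Hypothetical datapoints from one downloaded cluster.
cluster_points = [
    {"prediction_id": "a1", "prediction": "sandal", "actual": "sneaker"},
    {"prediction_id": "b2", "prediction": "sandal", "actual": "sneaker"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["prediction_id", "prediction", "actual"])
writer.writeheader()
writer.writerows(cluster_points)
csv_text = buf.getvalue()
print(csv_text.splitlines()[0])  # prediction_id,prediction,actual
```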

Export Data into Notebook Environment

Arize enables teams to continue analysis of their production data in notebooks.

Learn more here.

With a few lines of Python code, users can export their data into Phoenix or a Jupyter notebook for further analysis.

Two additional parameters control the UMAP projection itself (see "How to Configure your UMAP Generation?" above):

  • nNeighbors: controls how UMAP balances local versus global structure in the data. More specifically, it defines the local region, i.e., how many neighbors UMAP looks at around each point. Lower values emphasize local structure; higher values emphasize global structure. Allowed values range from 5 to 100.

  • minDist: the minimum distance apart that points are allowed to be in the low-dimensional representation. Allowed values range from 0.0 to 0.99.
