What is an Embedding?

Embeddings are vector representations of data. They appear throughout modern deep learning: in transformers, recommendation engines, the layers of deep neural networks, encoders, and decoders.
Embeddings are foundational because:
  1. Embeddings can represent images, audio signals, and even large chunks of structured data.
  2. They provide a common mathematical representation of your data.
  3. They compress your data.
  4. They preserve relationships within your data.
  5. They are the output of deep learning layers, providing comprehensible linear views into complex non-linear relationships learned by models.
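
To make "preserve relationships" concrete, here is a minimal sketch in plain Python. The 4-dimensional vectors are hand-written for illustration and are an assumption, not the output of any real model; `cosine_similarity` and `toy_embeddings` are hypothetical names. The point is only that related items end up closer together in the vector space than unrelated ones.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, written by hand for illustration only.
# A real model would produce vectors with hundreds of dimensions.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1, 0.0],
    "dog": [0.8, 0.9, 0.2, 0.1],
    "car": [0.1, 0.0, 0.9, 0.8],
}

sim_cat_dog = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
sim_cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])

# The two animals score much higher with each other than with the vehicle.
print(f"cat vs dog: {sim_cat_dog:.3f}")
print(f"cat vs car: {sim_cat_car:.3f}")
```

Cosine similarity is used here because it compares direction rather than magnitude; other distance measures would work for the same illustration.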

Why Embeddings for Monitoring and Analyzing Deep Learning Models?

Data drift in unstructured data such as images is difficult to measure. The measures typically used for drift in structured data support statistical analysis of structured labels, but they do not extend to unstructured data. The general challenge is that measuring drift in unstructured data requires understanding how the relationships inside the data itself have changed. Because embeddings encode those relationships as vectors, they give you a numeric representation that can be compared over time.
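
As a rough illustration of comparing embeddings over time, the sketch below measures drift as the Euclidean distance between the centroid (average vector) of a baseline set of embeddings and the centroid of a production set. This is one simple option chosen for illustration, not necessarily the metric any particular tool computes; the function names and the 2-dimensional sample data are hypothetical.

```python
import math


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid_drift(baseline, production):
    """Distance between the average baseline and average production embedding."""
    return euclidean_distance(centroid(baseline), centroid(production))


# Hypothetical 2-D embeddings; real embeddings have far more dimensions.
baseline_embeddings = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
production_embeddings = [[3.0, 4.0], [5.0, 4.0], [3.0, 6.0], [5.0, 6.0]]

drift = centroid_drift(baseline_embeddings, production_embeddings)
print(f"centroid drift: {drift:.2f}")
```

A single centroid distance only captures a shift in the average embedding. It would miss changes in spread or the appearance of a new cluster, so treat it as a starting point rather than a complete drift measure.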

Additional Resources

Check out our tutorials on how to send embeddings to Arize for different use cases.
Getting Started: Quick Guides
Multi-Class Sentiment Classification
Named Entity Recognition
Image Classification
Learn more about embeddings and troubleshooting with Arize:
Questions? Email us at [email protected] or Slack us in the #arize-support channel