AutoEmbeddings

Install the extra dependencies that the SDK needs to generate embedding vectors:

pip install "arize[AutoEmbeddings]"

The EmbeddingGenerator Class

EmbeddingGenerator is the Arize class that generates embedding vectors from your data.

Import EmbeddingGenerator from arize.pandas.embeddings, then initialize it with from_use_case (described under Methods):

from arize.pandas.embeddings import EmbeddingGenerator

Methods

from_use_case


Creates a generator for the given use_case. Pass use_case, plus any further options that the chosen use case accepts.

| Argument | Description |
| --- | --- |
| use_case | UseCases.NLP.SEQUENCE_CLASSIFICATION, UseCases.NLP.SUMMARIZATION, or UseCases.CV.IMAGE_CLASSIFICATION |
| model_name | Refer to Supported Models |
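
The Code Example section covers image classification and sequence classification. As a minimal sketch for the remaining use case, the helper below builds a summarization generator. It assumes that UseCases.NLP.SUMMARIZATION accepts the same options as sequence classification (model_name, tokenizer_max_length, batch_size). The helper name build_summarization_generator is hypothetical, and the import is deferred so the module still loads when the AutoEmbeddings extra is not installed.

```python
def build_summarization_generator(
    model_name: str = "distilbert-base-uncased",
    tokenizer_max_length: int = 512,
    batch_size: int = 100,
):
    """Hypothetical helper: create an EmbeddingGenerator for summarization.

    Assumes SUMMARIZATION takes the same options as SEQUENCE_CLASSIFICATION.
    Calling this requires the AutoEmbeddings extra and may download the model.
    """
    # Deferred import: only needed when the helper is actually called.
    from arize.pandas.embeddings import EmbeddingGenerator, UseCases

    return EmbeddingGenerator.from_use_case(
        use_case=UseCases.NLP.SUMMARIZATION,
        model_name=model_name,
        tokenizer_max_length=tokenizer_max_length,
        batch_size=batch_size,
    )
```

The returned generator would then be used like the NLP generator in the Code Example, for instance generator.generate_embeddings(text_col=df["summary"]), where the "summary" column is an assumed name.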

list_pretrained_models


Returns an up-to-date table of the supported models.

EmbeddingGenerator.list_pretrained_models()
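
A small sketch of calling this method defensively. The function supported_models is a hypothetical wrapper, not part of the SDK. It returns None when the AutoEmbeddings extra is missing, and otherwise returns whatever list_pretrained_models() produces. The sketch only assumes that the returned table can be printed and has a length.

```python
def supported_models():
    """Hypothetical wrapper: return the supported-models table, or None
    if the AutoEmbeddings extra is not installed."""
    try:
        from arize.pandas.embeddings import EmbeddingGenerator
    except ImportError:
        return None
    return EmbeddingGenerator.list_pretrained_models()


models = supported_models()
if models is None:
    print('AutoEmbeddings is not installed; run: pip install "arize[AutoEmbeddings]"')
else:
    # Inspect the table to pick a value for model_name.
    print(models)
```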

Code Example

from arize.pandas.embeddings import EmbeddingGenerator, UseCases

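# CV example: embed images referenced by local file paths.
# df is assumed to be an existing pandas DataFrame with a "local_path" column.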
generator = EmbeddingGenerator.from_use_case(
    use_case=UseCases.CV.IMAGE_CLASSIFICATION,
    model_name="google/vit-base-patch16-224-in21k",
    batch_size=100
)
df["image_vector"] = generator.generate_embeddings(
    local_image_path_col=df["local_path"]
)

# NLP example: embed the text in the "text" column of the same DataFrame.
generator = EmbeddingGenerator.from_use_case(
    use_case=UseCases.NLP.SEQUENCE_CLASSIFICATION,
    model_name="distilbert-base-uncased",
    tokenizer_max_length=512,
    batch_size=100
)
df["text_vector"] = generator.generate_embeddings(text_col=df["text"])
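
After generating a column such as df["text_vector"], it can be useful to confirm that every row received a vector of the same length before sending the data onward. The check below is a plain-Python sketch. The function embedding_dimension is a hypothetical helper, not part of the SDK, and the sample data stands in for a real embedding column.

```python
from typing import Iterable, Sequence


def embedding_dimension(vectors: Iterable[Sequence[float]]) -> int:
    """Return the length shared by all vectors; raise if they differ."""
    lengths = {len(v) for v in vectors}
    if not lengths:
        raise ValueError("no embedding vectors were provided")
    if len(lengths) > 1:
        raise ValueError(f"inconsistent embedding lengths: {sorted(lengths)}")
    return lengths.pop()


# Stand-in data; in practice pass a column such as df["text_vector"].
sample = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
print(embedding_dimension(sample))
```

A ValueError here would point to rows that need inspection before the vectors are used.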


Copyright © 2023 Arize AI, Inc