AutoEmbeddings
Install extra dependencies to generate embedding vectors
minimum required for Auto Embeddings
Install extra dependencies in the SDK:
pip install arize[AutoEmbeddings]
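To see which packages the extra adds, the installed distribution's metadata can be inspected with the standard library. This is a minimal sketch, not part of the Arize SDK: the helper names `requirements_for_extra` and `installed_extra_requirements` are hypothetical, and the comparison assumes the extra's name differs from `AutoEmbeddings` only by letter case in the metadata.

```python
import re
from importlib import metadata

# Matches environment markers such as: extra == "AutoEmbeddings" (either quote style).
_EXTRA_MARKER = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def requirements_for_extra(requirement_strings, extra):
    """Return the requirement specifiers that are gated behind the given extra."""
    wanted = extra.lower()
    selected = []
    for requirement in requirement_strings:
        # A requirement string looks like "name>=1.0; marker"; the marker is optional.
        specifier, _, marker = requirement.partition(";")
        match = _EXTRA_MARKER.search(marker)
        if match and match.group(1).lower() == wanted:
            selected.append(specifier.strip())
    return selected


def installed_extra_requirements(distribution, extra):
    """Read the metadata of an installed distribution; None if it is not installed."""
    try:
        declared = metadata.requires(distribution) or []
    except metadata.PackageNotFoundError:
        return None
    return requirements_for_extra(declared, extra)


if __name__ == "__main__":
    # Prints None when arize is not installed, otherwise the extra's requirements.
    print(installed_extra_requirements("arize", "AutoEmbeddings"))
```

A `None` result means the SDK itself is missing. A list means the SDK is installed and shows what the extra declares, which can help when an import fails later.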
Import and initialize EmbeddingGenerator from arize.pandas.embeddings:
from arize.pandas.embeddings import EmbeddingGenerator
from_use_case
Pass in use_case and more options depending on the use case.
Argument | Description |
---|---|
use_case | UseCases.NLP.SEQUENCE_CLASSIFICATION, UseCases.NLP.SUMMARIZATION, or UseCases.CV.IMAGE_CLASSIFICATION |
model_name |
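Since the extra options depend on the use case, it can help to collect them in one place before calling from_use_case. The sketch below is illustrative only: `build_generator_options` and its `kind` labels ("NLP", "CV") are hypothetical names. Restricting `tokenizer_max_length` to text use cases is an assumption based on the examples on this page, where it appears only in the NLP call.

```python
def build_generator_options(kind, model_name, batch_size=100, tokenizer_max_length=None):
    """Collect keyword arguments for from_use_case (hypothetical helper).

    kind is "NLP" or "CV"; these labels belong to this sketch, not to the SDK.
    """
    if kind not in ("NLP", "CV"):
        raise ValueError(f"unknown kind: {kind!r}")

    options = {"model_name": model_name, "batch_size": batch_size}

    if tokenizer_max_length is not None:
        # Assumption: only text use cases take a tokenizer length limit.
        if kind != "NLP":
            raise ValueError("tokenizer_max_length only applies to NLP use cases")
        options["tokenizer_max_length"] = tokenizer_max_length

    return options


nlp_options = build_generator_options(
    "NLP", "distilbert-base-uncased", tokenizer_max_length=512
)
cv_options = build_generator_options("CV", "google/vit-base-patch16-224-in21k")

# With the SDK installed, the options would be passed along like this:
# generator = EmbeddingGenerator.from_use_case(
#     use_case=UseCases.NLP.SEQUENCE_CLASSIFICATION, **nlp_options
# )
```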
list_pretrained_models
Returns an up-to-date table listing the supported models.
EmbeddingGenerator.list_pretrained_models()
from arize.pandas.embeddings import EmbeddingGenerator, UseCases

# example CV
generator = EmbeddingGenerator.from_use_case(
    use_case=UseCases.CV.IMAGE_CLASSIFICATION,
    model_name="google/vit-base-patch16-224-in21k",
    batch_size=100,
)
df["image_vector"] = generator.generate_embeddings(
    local_image_path_col=df["local_path"]
)

# example NLP
generator = EmbeddingGenerator.from_use_case(
    use_case=UseCases.NLP.SEQUENCE_CLASSIFICATION,
    model_name="distilbert-base-uncased",
    tokenizer_max_length=512,
    batch_size=100,
)
df["text_vector"] = generator.generate_embeddings(text_col=df["text"])
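After generating embeddings, a quick sanity check is to confirm that every row received a vector of the same length. The sketch below uses only the standard library. The helper names are hypothetical, and it assumes generate_embeddings yields one fixed-length vector per row, which this page does not state explicitly.

```python
from collections import Counter


def embedding_dimension_counts(vectors):
    """Count how many vectors have each length; missing vectors are counted under None."""
    counts = Counter()
    for vector in vectors:
        counts[None if vector is None else len(vector)] += 1
    return dict(counts)


def has_uniform_dimension(vectors):
    """True when there is at least one vector, none is missing, and all lengths match."""
    counts = embedding_dimension_counts(vectors)
    return len(counts) == 1 and None not in counts


# Works on any iterable of vectors, for example a DataFrame column:
# has_uniform_dimension(df["text_vector"])
```

If the check fails, the per-length counts point to the rows worth inspecting, such as empty text or unreadable image paths.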