Embedding

Arize class to map up to 3 columns (vector, data, and link_to_data) to a single embedding feature.

class Embedding(
    vector: List[float],
    data: Optional[Union[str, List[str]]],
    link_to_data: Optional[str],
)
| Parameter | Data Type | Description |
| --- | --- | --- |
| vector | List[float] | (Required) Vector of a given embedding feature. The contents of this column must be a List[float] or a NumPy array of floats (np.ndarray). |
| data | str or List[str] | (Optional) Used for the Natural Language Processing model type. Data of a given embedding feature, typically the raw text associated with the embedding vector. |
| link_to_data | str | (Optional) Used for the Computer Vision model type. Link to the data of a given embedding feature, typically a link to the data file (image, audio, ...) associated with the embedding vector. Host the data with a cloud storage provider (GCS, AWS, Azure), on a local server, or at a public URL. A separate page covers viewing private AWS S3 image links. Example: "https://link-to-my-image.png". NOTE: Currently only links to image files are supported. |
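
To make the field types above concrete, here is a minimal sketch built on a local stand-in class. `EmbeddingSketch` and `check_fields` are hypothetical names used only for illustration and are not part of the Arize SDK. The sketch mirrors the three documented fields and assumes that the two optional fields default to `None`.

```python
from typing import List, NamedTuple, Optional, Union


class EmbeddingSketch(NamedTuple):
    """Hypothetical stand-in mirroring the documented fields; not the Arize class."""

    vector: List[float]
    data: Optional[Union[str, List[str]]] = None  # assumed default
    link_to_data: Optional[str] = None  # assumed default


def check_fields(emb: EmbeddingSketch) -> None:
    """Raise TypeError if a field does not match its documented type."""
    if not all(isinstance(x, float) for x in emb.vector):
        raise TypeError("vector must contain only floats")
    if emb.data is not None:
        # Treat a single string as a one-item list so both forms share one check.
        items = [emb.data] if isinstance(emb.data, str) else emb.data
        if not all(isinstance(s, str) for s in items):
            raise TypeError("data must be a str or a list of str")
    if emb.link_to_data is not None and not isinstance(emb.link_to_data, str):
        raise TypeError("link_to_data must be a str")


# NLP-style feature: vector plus raw text.
nlp = EmbeddingSketch(vector=[4.0, 5.0, 6.0, 7.0], data="This is a test sentence")
# CV-style feature: vector plus a link to the image.
cv = EmbeddingSketch(vector=[1.0, 2.0, 3.0], link_to_data="https://link-to-my-image.png")

check_fields(nlp)
check_fields(cv)
```

Only `vector` is supplied in both cases; `data` is filled in for the text feature and `link_to_data` for the image feature, which matches the Required/Optional markers in the descriptions.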

Code Example

import numpy as np
import pandas as pd

from arize.utils.types import ModelTypes, Environments, Embedding

# Example NLP embedding feature: the vector plus the raw text it was computed from
embedding_features = {
    "nlp_embedding": Embedding(
        vector=pd.Series([4.0, 5.0, 6.0, 7.0]),
        data="This is a test sentence",
    ),
}

# Example CV embedding feature: the vector plus a link to the source image
embedding_features = {
    "image_embedding": Embedding(
        vector=np.array([1.0, 2, 3]),
        link_to_data="https://link-to-my-image.png",
    ),
}
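
The CV example passes `[1.0, 2, 3]` through `np.array`, which promotes the integers to floats. A plain Python list would keep them as integers. If you want to be sure a vector matches the documented `List[float]` type before building an `Embedding`, a small helper along these lines may be useful. `to_float_vector` is a hypothetical name and is not part of the Arize SDK.

```python
from typing import Iterable, List


def to_float_vector(values: Iterable) -> List[float]:
    """Convert an iterable of numbers (list, tuple, NumPy array, pandas Series) to a list of floats."""
    return [float(v) for v in values]


vector = to_float_vector([1.0, 2, 3])
print(vector)  # [1.0, 2.0, 3.0]
```

The resulting list can then be passed as the `vector` argument.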

Copyright © 2023 Arize AI, Inc