Natural Language Processing (NLP)
How to log your model schema for text classification use cases
Copyright © 2023 Arize AI, Inc
Text Classification Models predict the categories a piece of text might belong to.
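NLP Cases | Expected Fields | Performance Metrics |
---|---|---|
Text Classification | *prediction label, actual label, prediction score, actual score | Accuracy, Recall, Precision, FPR, FNR, F1, Sensitivity, Specificity |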
*All classification variant specifications apply to the NLP model type, with the addition of embeddings.
The `EmbeddingColumnNames` class constructs your embedding objects. You log them to the platform with a dictionary that maps each embedding feature name to its embedding object. See our API reference for more details.
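As a minimal sketch of the mapping described above, the snippet below builds a dictionary from an embedding feature name to an embedding object. To stay runnable without the SDK installed, it defines a stand-in named tuple limited to the two fields this page documents. The feature name `text_embedding` is a hypothetical choice, and the column names `text_vector` and `text` are borrowed from the example row. With the Arize SDK you would use the real `EmbeddingColumnNames` class instead; check the API reference for its import path and full signature.

```python
from typing import Dict, NamedTuple


class EmbeddingColumnNames(NamedTuple):
    """Stand-in for the SDK class, limited to the two fields described on this page."""

    vector_column_name: str  # column holding the embedding vectors
    data_column_name: str    # column holding the raw text


# Map each embedding feature name to its embedding object.
# "text_embedding" is a hypothetical feature name chosen for illustration.
embedding_features: Dict[str, EmbeddingColumnNames] = {
    "text_embedding": EmbeddingColumnNames(
        vector_column_name="text_vector",
        data_column_name="text",
    ),
}

for name, cols in embedding_features.items():
    print(f"{name}: vectors in '{cols.vector_column_name}', text in '{cols.data_column_name}'")
```

Keying the dictionary by feature name lets one model carry several embedding features, each pointing at its own pair of columns.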
Example Row
text_vector | text | prediction_label | actual_label | prediction_score | actual_score | Timestamp |
---|---|---|---|---|---|---|
Arize supports logging both the text itself and the embedding features associated with the text the model is acting on, using the `EmbeddingColumnNames` object.
The `data_column_name` should be the name of the column where the raw text associated with the vector is stored. It is the field typically chosen for NLP use cases. The column can contain either strings (full sentences) or lists of strings (token arrays).
See here for more information on embeddings and options for generating them.
The `vector_column_name` should be the name of the column where the embedding vectors are stored. The embedding vector is the dense vector representation of the unstructured input. Note: embedding features are not sparse vectors.
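The column requirements above can be checked before logging. The following is a rough sketch, not part of the Arize SDK: the helper names `is_valid_text` and `is_dense_vector`, the three-dimensional vectors, and the sample text are all hypothetical. It accepts raw text as either a string or a list of strings, and accepts a vector only if it stores one number per dimension, which is what distinguishes a dense vector from a sparse index/value encoding.

```python
from numbers import Real
from typing import Any


def is_valid_text(value: Any) -> bool:
    """Raw text may be a single string or a list of strings (a token array)."""
    if isinstance(value, str):
        return True
    return isinstance(value, list) and all(isinstance(tok, str) for tok in value)


def is_dense_vector(value: Any, expected_dim: int) -> bool:
    """A dense vector holds exactly one number per dimension."""
    return (
        isinstance(value, (list, tuple))
        and len(value) == expected_dim
        # bool is a subclass of int, so exclude it explicitly.
        and all(isinstance(x, Real) and not isinstance(x, bool) for x in value)
    )


# Hypothetical rows; the text and vector values are made up for illustration.
rows = [
    {"text": "great product", "text_vector": [0.12, -0.40, 0.33]},
    {"text": ["great", "product"], "text_vector": [0.05, 0.91, -0.27]},
]

for row in rows:
    ok = is_valid_text(row["text"]) and is_dense_vector(row["text_vector"], expected_dim=3)
    print(row["text"], "->", "ok" if ok else "invalid")
```

A sparse representation such as `{0: 0.12, 7: 0.33}` fails the dense check, since it is neither a list nor a tuple of per-dimension values.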
Expected fields for text classification*: prediction label, actual label, prediction score, actual score.
Performance metrics: Accuracy, Recall, Precision, FPR, FNR, F1, Sensitivity, Specificity.
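To make the listed metrics concrete, here is a small sketch that computes them from prediction and actual labels using the standard confusion-matrix definitions, treating one label as the positive class. This is not necessarily how the platform computes them: the function name `binary_metrics`, the sample labels, and the choice to return 0.0 when a denominator is zero are assumptions made for illustration.

```python
from typing import Dict, List


def binary_metrics(predicted: List[str], actual: List[str], positive: str) -> Dict[str, float]:
    """Compute one-vs-rest classification metrics for the given positive label."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    tn = sum(p != positive and a != positive for p, a in pairs)

    def ratio(num: float, den: float) -> float:
        # Assumption: report 0.0 instead of raising when the denominator is zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # recall is the same quantity as sensitivity
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": ratio(tn, tn + fp),
        "fpr": ratio(fp, fp + tn),
        "fnr": ratio(fn, fn + tp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical labels, made up for illustration.
predicted = ["positive", "positive", "positive", "neutral", "neutral"]
actual = ["positive", "positive", "neutral", "neutral", "positive"]

for metric, value in binary_metrics(predicted, actual, positive="positive").items():
    print(f"{metric}: {value:.3f}")
```

Note that sensitivity equals recall, and specificity equals 1 minus FPR, so the eight listed metrics are derived from the same four confusion-matrix counts.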
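Example row values:

prediction_label | actual_label | prediction_score | actual_score |
---|---|---|---|
positive | neutral | 0.3 | 1 |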