Python SDK Changelog
Updates to the Arize Python SDK
Last updated
Copyright © 2023 Arize AI, Inc
Bug Fixes
Fix a bug that caused ImportError when importing Client from arize.api
New Features
Optional strict typing in pandas logger Schema
Optional strict typing in record-at-a-time logger
Dependency Changes
Add optional extra dependencies if the Arize package is installed as pip install arize[NLP_Metrics]:
nltk>=3.0.0, <4
sacrebleu>=2.3.1, <3
rouge-score>=0.1.2, <1
evaluate>=0.3, <1
Validation Changes
Check that space and API keys are of string type
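As a rough illustration of the type check described above, the sketch below rejects non-string keys before any request is made. The helper name `check_keys_are_strings` and its parameter names are hypothetical and not taken from the SDK; the actual error type and message may differ.

```python
def check_keys_are_strings(space_key, api_key):
    """Hypothetical helper (not an SDK function): reject non-string keys."""
    for name, value in (("space_key", space_key), ("api_key", api_key)):
        if not isinstance(value, str):
            raise TypeError(
                f"{name} must be a string, got {type(value).__name__}"
            )


# A key accidentally passed as an integer is rejected up front.
try:
    check_keys_are_strings("my-space-key", 12345)
except TypeError as err:
    print(err)
```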
Bug Fixes
Address backward compatibility issue for batch logging via Pandas for on-prem customers
Dependency Updates:
Our Tracing extra requirements now include the package deprecated, a dependency of opentelemetry-semantic-conventions, whose absence produced an ImportError
New Features:
New batch ingestion via Pandas DataFrames for the MULTICLASS model type
New TRACING environment. You can now log spans & traces for your LLM applications into Arize using batch ingestion via Pandas DataFrames
Removed size limitation on our Schema. You can now log wider models (more columns in your DataFrame)
Validation Changes
Prediction ID and Ranking Group ID have an increased character limit from 128 to 512
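To catch overlong IDs before logging, a pre-check along these lines may help. This is a minimal sketch: `find_overlong_ids` is a hypothetical helper, not an SDK function, and it assumes the limit counts characters and that an ID of exactly 512 characters is accepted.

```python
MAX_ID_LENGTH = 512  # limit stated in the entry above (previously 128)


def find_overlong_ids(ids, max_length=MAX_ID_LENGTH):
    """Hypothetical helper: return the IDs whose length exceeds the limit."""
    return [value for value in ids if len(str(value)) > max_length]


ids = ["order-001", "x" * 600]
too_long = find_overlong_ids(ids)
print(f"{len(too_long)} ID(s) exceed {MAX_ID_LENGTH} characters")
```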
Dependency Updates:
Our MimicExplainer extra requirements are now more relaxed. We only require interpret-community[mimic]>=0.22.0,<1
New Features:
New MULTICLASS model type available for record-at-a-time ingestion
Bug Fixes:
Fix a bug that caused missing columns validation feedback to have repeated columns in the message
Fix a bug that caused a KeyError when llm_params is not found in the dataframe. The feedback given to the user in this case was also improved.
New Features
Enable latent actuals for GENERATIVE_LLM models
Feedback Enhancements
Enable feedback when files are too large for better user experience and troubleshooting
Dependency Changes
Updated pandas requirement. We now accept pandas 2.x
Bug Fixes:
Default prediction is now sent as a string for the GENERATIVE_LLM single-record logger (previously it was incorrectly set as an integer, so it was categorized as a prediction score instead of a prediction label)
Bug Fixes:
Check the value of prompt/response raw_data only if not None
New Features
Add CORPUS support
Accept strings for prompt and response
Make prompt and response optional
Add support for a list of strings features in single-record-logger
Bug Fixes:
Avoid creating a view of a Pandas dataframe
Validation Changes
Add validation on embedding raw data for batch and record-at-a-time loggers
Raise validation string limits for string fields
Add truncation warnings for long string fields
New ability to send features with type list[str]
New fields available to send token usage to Arize, both using our pandas batch logger and the single record logger
Validation Changes
Increase time interval validation from 2 years to 5 years
Dependency Changes
Require python>=3.6 (as opposed to python>=3.8) for our core SDK. Our extras still require python>=3.8. See Python SDK for more details.
Require pyarrow>=0.15.0 (as opposed to pyarrow>=5.0.0)
New Features
Add prompt templates and LLM config fields to the single log and pandas batch ingestion. These fields are used in the Arize Prompt Template Playground
Validation Changes
Add a validation check that fails if there are more than 30 embedding features sent
New Features
Add filtering via the keyword where to the Exporter client
New Features
AutoEmbeddings supports any model in the HuggingFace Hub, public or private.
Add AutoEmbeddings UseCase for Object Detection
Add EmbeddingGenerator.list_default_models() method
Deprecations
Computer Vision AutoEmbeddings switched from using FeatureExtractor (deprecated by HuggingFace) to the ImageProcessor class
New Features
Authenticating Arize Client using environment variables
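The idea behind environment-variable authentication can be sketched as follows: explicit arguments win, and missing ones fall back to the environment. The variable names `ARIZE_SPACE_KEY` and `ARIZE_API_KEY` are an assumption here and should be confirmed against the SDK documentation; `resolve_keys` is a hypothetical helper, not part of the SDK.

```python
import os

# Assumed variable names; confirm against the SDK documentation.
SPACE_KEY_ENV = "ARIZE_SPACE_KEY"
API_KEY_ENV = "ARIZE_API_KEY"


def resolve_keys(space_key=None, api_key=None, env=None):
    """Hypothetical helper: prefer explicit keys, else read the environment."""
    env = os.environ if env is None else env
    if space_key is None:
        space_key = env.get(SPACE_KEY_ENV)
    if api_key is None:
        api_key = env.get(API_KEY_ENV)
    if space_key is None or api_key is None:
        raise ValueError(
            f"Provide both keys explicitly or set {SPACE_KEY_ENV} and {API_KEY_ENV}"
        )
    return space_key, api_key
```

Passing a plain dictionary as `env` makes the fallback easy to exercise in tests without touching the real environment.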
Bug Fixes
A bug causing permission errors for pandas logging on Windows machines
A bug forcing tags to be strings
New Features
Add Generative LLM model-type support for single-record logging
Dependency Changes
Removed dependency on interpret for the MimicExplainer
Enhancements
Add a progress bar to the Exporter client
Sort exported dataframe by time
Update reserved headers
Validation Changes
Add validation check to Exporter client that will fail if start_time > end_time
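The same ordering check can be run on the caller's side before building an export query. The sketch below is illustrative only: `check_time_range` is a hypothetical helper, and the exception type the Exporter client actually raises is not stated in this changelog.

```python
from datetime import datetime


def check_time_range(start_time, end_time):
    """Hypothetical helper: fail when the range is inverted."""
    if start_time > end_time:
        raise ValueError(
            f"start_time ({start_time}) must not be after end_time ({end_time})"
        )


# A valid range passes silently.
check_time_range(datetime(2023, 1, 1), datetime(2023, 2, 1))
```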
Bug Fixes
A bug causing an error to be returned when a query returns no data. An empty response is now returned instead
A bug causing the Exporter client to return empty columns in the dataframe if there was no data in them
A bug causing incorrect parsing of GENERATIVE_LLM model fields: prompt & response
Dependency Changes
Add missing dependency for Exporter: tqdm>=4.60.0,<5
Dependency Changes
Relax protobuf requirements from protobuf~=3.12 to protobuf>=3.12, <5
New Features
Python Export Client: you can now export data from Arize using the Python SDK
Bug Fixes
A bug preventing REGRESSION models from using the MimicExplainer
Validation Changes
Remove null value validation for prediction_label and actual_label from single-record logging
Add model mapping rules validation for OBJECT_DETECTION models
Feedback Enhancements
Improve error messages around prediction ID, prediction labels, and tags
Bug Fixes
A bug causing predictions to be sent as scores instead of labels for NUMERIC model types
Validation Changes
Add a validation check that will fail if the character limit on tags (1000 max) is exceeded
Add a validation check that will fail if actuals are sent without prediction ID information (for single-record logging). This would result in a delayed record being sent without a prediction ID, which is necessary for the latent join
Add a validation check that will fail if the Schema, without prediction columns, does not contain a prediction ID column (for pandas logging). This would result in a delayed record being sent without a prediction ID, which is necessary for the latent join
Add a validation check that will fail if the Schema points to an empty string as a column name
Add check for invalid index in AutoEmbeddings: DataFrames must have a sorted, continuous index starting at 0
Remove label requirements & accept null values on SCORE_CATEGORICAL, NUMERIC, and RANKING models
Allow feature and tag columns to contain null values for pandas logging
Allow sending delayed actuals for RANKING models: the presence of rank and prediction_group_id columns in the Schema is no longer enforced. However, if the columns are sent, they must not contain nulls, since we cannot construct predictions with either value null
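The AutoEmbeddings index requirement listed above (a sorted, continuous index starting at 0) can be checked up front. The function below is a minimal sketch that works on any sequence of index labels; `has_valid_index` is a hypothetical name, not an SDK function, and the SDK's own check may be implemented differently.

```python
def has_valid_index(index):
    """Return True if the labels are exactly 0, 1, ..., n-1, in that order."""
    return list(index) == list(range(len(index)))


# With pandas, one would pass df.index. If the check fails (for example after
# filtering rows), df.reset_index(drop=True) restores a 0..n-1 index.
print(has_valid_index([0, 1, 2]))  # index as produced by a fresh DataFrame
print(has_valid_index([0, 2, 5]))  # gaps left behind by row filtering
```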
Dependency Changes
Change optional dependency for MimicExplainer: raise the version ceiling of lightgbm from 3.3.4 to 4
Bug Fixes
A bug causing GENERATIVE_LLM models to be sent as SCORE_CATEGORICAL models
New Features
Add Object Detection model-type support
Add Generative LLM model-type support for pandas logging
Add evaluation metrics generation for Generative LLM models
Make prediction IDs optional
Add summarization UseCase to AutoEmbeddings
Add optional, additional custom headers to Client instantiation
Feedback Enhancements
Add a warning message when only actuals are sent
Add a descriptive error message when embedding features are sent without a vector
Add warning when prediction label or prediction ID will be defaulted
Bug Fixes
A bug causing skipped validation checks when the new REGRESSION and CATEGORICAL model types are selected
Validation Changes
Add a validation check that will fail if the character limit on prediction ID (128 max) is exceeded
Add a validation check that will fail if there are duplicated columns in the dataframe
Changed time range requirements to -2/+1 (two years in the past and one year in the future)
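The -2/+1 window from this release can be approximated with a small pre-check. This is a sketch under stated assumptions: a year is taken as 365 days, timestamps are timezone-aware, and `is_within_time_range` is a hypothetical helper rather than the SDK's validation code.

```python
from datetime import datetime, timedelta, timezone

PAST_WINDOW = timedelta(days=2 * 365)  # "two years in the past", approximated
FUTURE_WINDOW = timedelta(days=365)    # "one year in the future", approximated


def is_within_time_range(timestamp, now=None):
    """Hypothetical helper: True if timestamp lies in [now - 2y, now + 1y]."""
    now = datetime.now(timezone.utc) if now is None else now
    return now - PAST_WINDOW <= timestamp <= now + FUTURE_WINDOW


reference = datetime(2023, 6, 1, tzinfo=timezone.utc)
three_years_ago = reference - timedelta(days=3 * 365)
print(is_within_time_range(three_years_ago, now=reference))
```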
Dependency Changes
Require Python >= 3.8
Add optional extra dependencies if the Arize package is installed as pip install arize[LLM_Evaluation]:
nltk>=3.0.0, <4
sacrebleu>=2.3.1, <3
rouge-score>=0.1.2, <1
evaluate>=0.3, <1
Deprecations
Remove numeric_sequence support