Pandas SDK Example
Log model inference data directly to Arize using SDK/API methods
This page shows how to send data to Arize using the Python Pandas SDK.
Step 1: Set Up Python SDK
Install Arize SDK
Initialize the Arize client from arize.pandas.logger to call Client.log()
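The setup steps above might look like the following sketch. It assumes the SDK has been installed with `pip install arize` and that the installed version's `Client` accepts `space_key` and `api_key` keyword arguments (parameter names can differ between SDK versions, so check yours). The environment variable names and the helper functions are hypothetical.

```python
import os


def read_credentials(env=os.environ):
    """Collect Arize credentials from a mapping (defaults to the process env).

    ARIZE_SPACE_KEY and ARIZE_API_KEY are hypothetical variable names.
    """
    return {
        "space_key": env["ARIZE_SPACE_KEY"],
        "api_key": env["ARIZE_API_KEY"],
    }


def make_arize_client(env=os.environ):
    """Build the pandas logging client (requires `pip install arize`).

    Assumption: this SDK version's Client takes `space_key` and `api_key`.
    """
    # Imported lazily so this module still loads where arize is not installed.
    from arize.pandas.logger import Client

    return Client(**read_credentials(env))
```

Reading credentials from the environment keeps keys out of source control; the returned client is the object you later call `log()` on.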
Step 2: Set Model Schema Attributes
A model schema consists of required and optional parameters. The optional parameters vary by model type. Learn more about model types here. See the full list of schema attributes and their definitions here.
Example Row
prediction_id | prediction_ts | prediction_label | actual_label | state | states | gender | vector | text | image_link |
---|---|---|---|---|---|---|---|---|---|
1fcd50f4689 | 1637538845 | No Claims | No Claims | ca | [ca, ak] | female | [1.27346, -0.2138, ...] | "This is an example text" | "https://example_ur.jpg" |
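As a rough illustration, the example row can be built as a pandas DataFrame and mapped to a schema. This is a sketch under assumptions: the `Schema` keyword names below should be checked against your installed SDK version, and treating `state`, `states`, and `gender` as features is a guess based on the column names. The `vector` column keeps only the two values visible in the example row, which is truncated.

```python
import pandas as pd

# One-row DataFrame mirroring the example row.
df = pd.DataFrame(
    {
        "prediction_id": ["1fcd50f4689"],
        "prediction_ts": [1637538845],
        "prediction_label": ["No Claims"],
        "actual_label": ["No Claims"],
        "state": ["ca"],
        "states": [["ca", "ak"]],
        "gender": ["female"],
        # The example row truncates this vector; only the visible values appear.
        "vector": [[1.27346, -0.2138]],
        "text": ["This is an example text"],
        "image_link": ["https://example_ur.jpg"],
    }
)


def build_schema():
    """Map DataFrame columns to schema attributes (requires `pip install arize`).

    Assumption: these Schema keyword names match your SDK version.
    """
    # Imported lazily so the DataFrame part runs without arize installed.
    from arize.pandas.logger import Schema

    return Schema(
        prediction_id_column_name="prediction_id",
        timestamp_column_name="prediction_ts",
        prediction_label_column_name="prediction_label",
        actual_label_column_name="actual_label",
        feature_column_names=["state", "states", "gender"],  # assumed features
    )
```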
Optional: Typed Columns
See Sending Data FAQ for more info on SDK typing features.
Optional: Embeddings
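A minimal sketch of describing an embedding feature, reusing the `vector`, `text`, and `image_link` columns from the example row. It assumes the SDK exposes `EmbeddingColumnNames` in `arize.utils.types` with the keyword names shown, and that the resulting dictionary is passed to the schema's `embedding_feature_column_names` argument. The feature name `example_embedding` is hypothetical.

```python
# Column roles for one embedding feature, using the example row's columns.
EMBEDDING_COLUMNS = {
    "vector_column_name": "vector",            # the embedding vector itself
    "data_column_name": "text",                # raw data behind the vector (optional)
    "link_to_data_column_name": "image_link",  # URL to the underlying asset (optional)
}


def build_embedding_features():
    """Return a mapping of embedding feature name to its column roles.

    Assumption: EmbeddingColumnNames accepts these keywords in your SDK
    version. "example_embedding" is a hypothetical feature name.
    """
    # Imported lazily so the constant above is usable without arize installed.
    from arize.utils.types import EmbeddingColumnNames

    return {"example_embedding": EmbeddingColumnNames(**EMBEDDING_COLUMNS)}
```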
Optional: SHAP Values
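One way to prepare SHAP columns is to build a mapping from each feature column to the column holding its SHAP values. The sketch below assumes the schema accepts such a dictionary through a `shap_values_column_names` argument. The `_shap` suffix and the helper name are hypothetical conventions.

```python
def shap_column_mapping(feature_names, suffix="_shap"):
    """Map each feature column to the column holding its SHAP values."""
    return {name: f"{name}{suffix}" for name in feature_names}


# Example with two of the feature columns from the example row.
shap_values_column_names = shap_column_mapping(["state", "gender"])

# Assumed usage: Schema(..., shap_values_column_names=shap_values_column_names)
```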
Optional: Delayed Actuals
If your model receives delayed actuals, log them using the same prediction ID as the original prediction. The shared ID links the two uploads together in the Arize platform. Delayed actuals can be delivered days or weeks after the prediction is received.
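The sketch below illustrates the shared prediction ID with two DataFrames: one logged at prediction time and one logged later with the actuals. The `isin` check is only a local sanity check before upload and is not how Arize performs the linking. The variable names are hypothetical.

```python
import pandas as pd

# Logged first, at prediction time.
predictions = pd.DataFrame(
    {
        "prediction_id": ["1fcd50f4689"],
        "prediction_ts": [1637538845],
        "prediction_label": ["No Claims"],
    }
)

# Logged later, once the ground truth is known; same prediction_id.
delayed_actuals = pd.DataFrame(
    {
        "prediction_id": ["1fcd50f4689"],
        "actual_label": ["No Claims"],
    }
)

# Local sanity check: every actual should match a previously logged prediction ID.
known_ids = predictions["prediction_id"]
unmatched = delayed_actuals[~delayed_actuals["prediction_id"].isin(known_ids)]
```

An empty `unmatched` frame means every delayed actual can be joined to a prediction by ID.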
Step 3: Log Inferences
Arize expects the DataFrame's index to be sorted and to begin at 0. If you perform operations that might affect the index before logging data, reset it first, for example with df.reset_index(drop=True).
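The index reset and the logging call might look like the following sketch. The pandas part runs on its own. The `log_inferences` helper is hypothetical: it assumes `Client.log()` accepts the keyword arguments shown, and it uses `ModelTypes.SCORE_CATEGORICAL` and the placeholder model ID `sample-model` purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"prediction_id": ["a", "b", "c"], "prediction_label": ["x", "y", "x"]}
)

# Filtering leaves gaps in the index (rows 0 and 2 remain).
filtered = df[df["prediction_label"] == "x"]

# Reset so the index is sorted and starts at 0; drop=True discards the old index.
clean = filtered.reset_index(drop=True)


def log_inferences(client, dataframe, schema):
    """Send a DataFrame to Arize (requires `pip install arize`).

    Assumptions: these keyword names match your SDK version; model_id,
    model_version, and the model type are placeholders.
    """
    # Imported lazily so the pandas part runs without arize installed.
    from arize.utils.types import Environments, ModelTypes

    return client.log(
        dataframe=dataframe,
        schema=schema,
        model_id="sample-model",  # hypothetical model name
        model_version="v1",       # hypothetical version label
        model_type=ModelTypes.SCORE_CATEGORICAL,
        environment=Environments.PRODUCTION,
    )
```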
Optional: Metrics Validation
An optional argument specifies the groups of metrics you want validated. Given the model_type and the schema, Arize validates that the expected metrics will be available in the platform and that the required schema columns are present.
Call __repr__() on a Metrics enum to see its description:
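A small sketch of inspecting the metric families, assuming `Metrics` is an enum importable from `arize.utils.types` in your SDK version. The helper name is hypothetical, and the `metrics_validation` argument name in the closing comment is likewise an assumption to verify against your SDK.

```python
def describe_metric_families():
    """Return {family name: description} for every Metrics member.

    Requires `pip install arize`. Assumption: Metrics is an Enum whose
    __repr__ returns a human-readable description.
    """
    # Imported lazily so this module still loads where arize is not installed.
    from arize.utils.types import Metrics

    # repr(member) calls member.__repr__() under the hood.
    return {family.name: repr(family) for family in Metrics}


# Assumed usage when logging:
#   client.log(..., metrics_validation=[Metrics.CLASSIFICATION])
```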
Learn more about metrics families here.
Other Supported SDKs
Python Pandas SDK (log a pandas dataframe)
Python Single Record SDK (log a single record)
Tutorials on how to log predictions, actuals, and feature importance.
Logging Predictions Only
Logging Predictions First, Then Logging Delayed Actuals
Logging Predictions First, Then Logging SHAPs After
Logging Predictions and Actuals Together
Logging Predictions and SHAP Together
Logging Predictions, Actuals, and SHAP Together
Logging PySpark DataFrames