Arize AI
Data Ingestion FAQ
Frequently asked questions about data ingestion

1. I've sent actuals, but I don't see them show up on Arize.

Arize uses the prediction_id field to join each actual to its corresponding prediction. You can send the actual at a later time, or right away if you already know the ground truth about a prediction.
If an actual's prediction_id does not match the prediction_id of a previously sent prediction, the actual will not be displayed, even though Arize received it.
Your model and predictions will usually show up immediately when you log them to the Arize platform. The time that it takes actuals to show up depends on the way they were sent:
  • Together with predictions - in this case you can expect to see actuals, as well as performance metrics, usually 10 minutes after Arize receives them.
  • Delayed - if you send actuals at a later time because they are unknown at inference, Arize matches them to their corresponding predictions once per day.
Arize looks back 14 days to match an actual to its corresponding prediction.
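The matching rule described above can be illustrated with a minimal sketch. This is not Arize's implementation: the function name, the data shapes, and the choice to measure the 14-day window from the time the prediction was received are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

LOOKBACK = timedelta(days=14)  # lookback window stated in this FAQ


def match_actuals(predictions, actuals):
    """Pair each actual with a prediction that shares its prediction_id.

    predictions: dict mapping prediction_id -> time the prediction was received
    actuals: list of (prediction_id, time the actual was received) tuples
    Returns (matched, unmatched) lists of prediction IDs.
    """
    matched, unmatched = [], []
    for prediction_id, actual_time in actuals:
        predicted_at = predictions.get(prediction_id)
        # Assumption: an actual matches only when a prediction with the same
        # ID exists and was received at most 14 days before the actual.
        if predicted_at is not None and timedelta(0) <= actual_time - predicted_at <= LOOKBACK:
            matched.append(prediction_id)
        else:
            unmatched.append(prediction_id)
    return matched, unmatched


now = datetime(2024, 1, 31)
predictions = {
    "a1": now - timedelta(days=2),   # inside the lookback window
    "b2": now - timedelta(days=20),  # older than the lookback window
}
actuals = [("a1", now), ("b2", now), ("c3", now)]  # "c3" has no prediction
matched, unmatched = match_actuals(predictions, actuals)
print(matched)    # ['a1']
print(unmatched)  # ['b2', 'c3']
```

In this sketch, "b2" fails because its prediction is too old and "c3" fails because no prediction with that ID was ever sent. Those are the two cases in which an actual is received but not displayed.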

2. Did Arize receive my prediction/actual record?

When you log to Arize, the SDK returns a list of futures. To check whether each call was successful, check the response code of each future's result. You can use the following code to test your responses.
    import concurrent.futures as cf


    def arize_responses_helper(responses):
        """
        responses: a list of responses from Arize
        returns: None
        """
        for response in cf.as_completed(responses):
            res = response.result()
            if res.status_code != 200:
                raise ValueError(f'failed with code {res.status_code}, {res.text}')


    # Logging to Arize, returns a list of responses
    responses = arize.log(...)  # your log call
    # Check responses!
    arize_responses_helper(responses)
After receiving a 200 response code, head over to your model's Data Ingestion Tab to confirm that Arize has received your data.
The model's inferences are indexed by the received timestamp, NOT the timestamp of the inferences.
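If you would rather see every failed response than stop at the first one, a variant of the helper can collect them. The sketch below is one possible approach, not part of the Arize SDK: `collect_failed_responses` and `make_stub_future` are hypothetical names, and the stub futures only imitate the `status_code` and `text` attributes used above so the example runs without calling Arize.

```python
import concurrent.futures as cf
from types import SimpleNamespace


def collect_failed_responses(responses):
    """Return (status_code, text) for every response that is not a 200."""
    failures = []
    for future in cf.as_completed(responses):
        res = future.result()
        if res.status_code != 200:
            failures.append((res.status_code, res.text))
    return failures


def make_stub_future(status_code, text=""):
    """Hypothetical stand-in for a future returned by a log call."""
    future = cf.Future()
    future.set_result(SimpleNamespace(status_code=status_code, text=text))
    return future


stub_responses = [make_stub_future(200), make_stub_future(400, "bad request")]
failures = collect_failed_responses(stub_responses)
print(failures)  # [(400, 'bad request')]
```

With real futures, `future.result()` can also raise if the request itself failed, so you may want to wrap that call in a `try`/`except` block.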

3. How do I troubleshoot if I don't see my predictions/actual record in the Arize Platform?

Please reach out to our team if you encounter data issues.
A common issue is mismatched IDs between predictions and actuals. One way to troubleshoot this yourself is to use our data export feature.
At the top-right corner of every widget, there is an option to export the widget's data. You will receive an email with a link to the data export. The export comes with a Google Colab notebook where you can use pandas DataFrames to view the raw data sent to the platform. This is helpful for troubleshooting the count of matched actuals.

Data Export

In some cases, you may want to export the data to ensure the correct data is inside of the Arize platform or to troubleshoot your data.
First set the date range for the data you wish to export.
Next, you can navigate to your dashboard and export predictions and actuals from your model.
An email will be sent to the email address of the user exporting the data. You can copy the link in the email to open the associated Colab.
Next, you can paste the data URL into the Colab.
Following the rest of the Colab, the data will be transformed into a pandas DataFrame.
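Once the exported data is loaded, mismatched IDs can be checked with a few set operations. The following is a minimal sketch in plain Python. The `prediction_id` key and the function name are assumptions, since the column names in your export may differ. If you are working with a pandas DataFrame, `DataFrame.to_dict("records")` produces rows in the shape used here.

```python
def compare_prediction_ids(prediction_rows, actual_rows, id_key="prediction_id"):
    """Summarize how actual IDs line up with prediction IDs.

    Both arguments are lists of dicts, one dict per exported record.
    """
    prediction_ids = {row[id_key] for row in prediction_rows}
    actual_ids = {row[id_key] for row in actual_rows}
    return {
        "matched": sorted(prediction_ids & actual_ids),
        "actuals_without_prediction": sorted(actual_ids - prediction_ids),
        "predictions_without_actual": sorted(prediction_ids - actual_ids),
    }


# Hypothetical exported records
prediction_rows = [{"prediction_id": "101"}, {"prediction_id": "102"}, {"prediction_id": "103"}]
actual_rows = [{"prediction_id": "101"}, {"prediction_id": "102 "}, {"prediction_id": "104"}]

report = compare_prediction_ids(prediction_rows, actual_rows)
print(report["matched"])                     # ['101']
print(report["actuals_without_prediction"])  # ['102 ', '104']
```

Note that `"102 "` (with trailing whitespace) does not match `"102"`. Small formatting differences like this, or an ID sent as an int in one place and a str in another, are worth checking first.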

4. What happens if we upload the same data with the same prediction ID twice? Does Arize treat that as one prediction/observation or as two?

They are treated as separate observations. Two predictions sent with the same prediction ID count as 2 predictions. If an actual was sent for both predictions, they show up as 2 separate predictions, each with its own matching actual.
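Because duplicates are counted separately, it can help to check a batch for repeated prediction IDs before sending it. A minimal sketch, using a hypothetical helper name and sample IDs:

```python
from collections import Counter


def find_duplicate_ids(prediction_ids):
    """Return {prediction_id: count} for IDs that appear more than once."""
    counts = Counter(prediction_ids)
    return {pid: n for pid, n in counts.items() if n > 1}


batch_ids = ["p-1", "p-2", "p-1", "p-3", "p-2", "p-1"]
duplicates = find_duplicate_ids(batch_ids)
print(duplicates)  # {'p-1': 3, 'p-2': 2}
```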
Questions? Email us at [email protected] or Slack us in the #arize-support channel

5. What are the Supported Data Types for the Python SDK?

We currently support the following data types for the corresponding columns.
Column Type            Supported Data Types
Features               int, float, str, bool
Prediction ID          int, str
Prediction Timestamps  int, float, date, datetime
Prediction Score       int, float
Actual Score           int, float
SHAP values            int, float
Supported data types for Prediction and Actual labels and scores depend on the model type.
Column Type              Score Categorical   Numeric
Prediction Label         str                 int, float
Actual Label             str                 int, float
Prediction Score         int, float          NA
Actual Score             int, float          NA
Actual Numeric Sequence  List[int, float]    NA
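The first table above can be turned into a quick pre-flight check. The sketch below is an illustration, not SDK behavior: the dictionary keys and function name are hypothetical, and it only covers built-in Python types, so it makes no claim about how NumPy or pandas scalar types are handled.

```python
from datetime import date, datetime

# Allowed Python types per column, transcribed from the first table above.
SUPPORTED_TYPES = {
    "feature": (int, float, str, bool),
    "prediction_id": (int, str),
    "prediction_timestamp": (int, float, date, datetime),
    "prediction_score": (int, float),
    "actual_score": (int, float),
    "shap_value": (int, float),
}


def unsupported_values(column_type, values):
    """Return the values whose Python type is not allowed for column_type."""
    allowed = SUPPORTED_TYPES[column_type]
    return [
        v for v in values
        # bool is a subclass of int, so reject it unless bool is listed.
        if not isinstance(v, allowed) or (isinstance(v, bool) and bool not in allowed)
    ]


bad_features = unsupported_values("feature", [1, 2.5, "red", True, None, [1, 2]])
bad_ids = unsupported_values("prediction_id", ["abc", 7, 3.0, True])
print(bad_features)  # [None, [1, 2]]
print(bad_ids)       # [3.0, True]
```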

6. What if my predictions/actuals have None/NaN/Inf values?

They are generally not allowed and cause the dataset to be rejected. The exceptions are Prediction Score and Actual Score, which are treated as empty if missing or omitted.
For more information about these fields, see Model Types.
In a pandas DataFrame, no column should have mixed types.

Classification

Field             arize.pandas      arize.log()
Prediction Label  Not allowed       Not allowed
Prediction Score  Treated as empty  Treated as empty
Actual Label      Not allowed       Not allowed
Actual Score      Treated as empty  Treated as empty

Regression

Field             arize.pandas  arize.log()
Prediction Label  Not allowed   Not allowed
Actual Label      Not allowed   Not allowed
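To catch these values before logging, labels can be scanned for None, NaN, and Inf, and columns can be checked for mixed types. The following is a minimal sketch with hypothetical function names. It works on plain Python lists and does not reproduce the SDK's own validation.

```python
import math


def is_missing_or_non_finite(value):
    """True for None, NaN, and positive or negative infinity."""
    if value is None:
        return True
    if isinstance(value, float):
        return math.isnan(value) or math.isinf(value)
    return False


def invalid_label_positions(labels):
    """Indexes of labels that are None, NaN, or Inf."""
    return [i for i, v in enumerate(labels) if is_missing_or_non_finite(v)]


def has_mixed_types(values):
    """True when the non-None values have more than one Python type."""
    return len({type(v) for v in values if v is not None}) > 1


print(invalid_label_positions(["cat", None, "dog"]))                       # [1]
print(invalid_label_positions([1.0, float("nan"), float("inf"), 2.0]))     # [1, 2]
print(has_mixed_types([1, "2", 3]))                                        # True
```

Per the tables above, positions flagged in a Prediction Label or Actual Label column need to be fixed or dropped before sending, while the same values in a score column are treated as empty.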

7. What if my features have None/NaN/Inf values?

They are accepted and treated as empty.