AWS S3 Example
How to set up an import job to ingest data into Arize from an S3 bucket
A few steps need to be completed for Arize to start syncing your files as model inferences.
If you prefer to use Terraform, jump to Applying Bucket Policy & Tag via Terraform

Step 1. Get the Bucket Name and Prefix

Create a bucket (if you don't have one already) and folder (optional) from which you would like Arize to pull a particular model's inferences.
For example, you might set up an S3 bucket and folder named s3://arize-models/cc-fraud/ that contains CSV files of your model's inferences.
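As a rough sketch of this step, the snippet below builds the s3:// URI you will enter in the import wizard and wraps bucket creation in a helper. The bucket name and prefix are the example values from above; there are no real "folders" in S3, only key prefixes.

```python
BUCKET = "arize-models"   # example name from above; use your own bucket
PREFIX = "cc-fraud/"      # optional "folder" -- really just an S3 key prefix

# The s3:// URI corresponding to the Bucket Name + Prefix in the wizard.
s3_uri = f"s3://{BUCKET}/{PREFIX}"

def create_import_bucket(bucket: str) -> None:
    """Create the bucket with boto3 (requires AWS credentials)."""
    import boto3
    s3 = boto3.client("s3")
    # Outside us-east-1, also pass
    # CreateBucketConfiguration={"LocationConstraint": "<region>"}.
    s3.create_bucket(Bucket=bucket)
```

You can skip the helper entirely if the bucket already exists; only the bucket name and prefix matter to the wizard.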

Step 2. Start the File Import Wizard

Navigate to the File Import section of your workspace in the Arize platform.
Navigate to the File Imports page
  • Create a new file import job and follow the setup steps.
  • Storage Selection: Amazon S3
Select Amazon S3 option
  • Fill in the Bucket Name and Prefix details from Step 1.

Step 3. Grant Arize access privileges

  • In Arize UI: Copy the policy supplied by Arize in the file importer job setup
Capture the policy to apply to the bucket
  • In the AWS console: Navigate to your S3 bucket -> Permissions -> Edit Bucket Policy
Add/Edit bucket policy
  • In the AWS console: Paste AWS policy from Arize UI
Add policy to your bucket
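If you prefer to grant access programmatically rather than pasting in the console, the sketch below builds a read-only bucket policy with the same actions the Terraform example at the end of this page uses. The principal ARN here is a placeholder; use the exact policy copied from the Arize UI, which contains the correct role.

```python
import json

BUCKET = "arize-models"  # assumption: your bucket name
# Placeholder principal -- the real ARN comes from the policy in the Arize UI.
ARIZE_ROLE_ARN = "arn:aws:iam::<ACCOUNT_ID>:role/arize-importer"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": ARIZE_ROLE_ARN},
            # Read-only: list the bucket, read objects, and read bucket tags.
            "Action": ["s3:GetBucketTagging", "s3:GetObject", "s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{BUCKET}", f"arn:aws:s3:::{BUCKET}/*"],
        }
    ],
}

def apply_policy(bucket: str) -> None:
    """Equivalent to pasting the policy in the console (needs AWS credentials)."""
    import boto3
    boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```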

Step 4. Add Proof of Ownership to your Bucket

Tag your bucket with the key arize-ingestion-key and the provided tag value (see AWS Object Tags).
  • In Arize UI: Copy arize-ingestion-key value
Capture your unique arize-ingestion-key
  • In AWS Console: Navigate to your S3 bucket -> Properties -> Tags -> Edit
Navigate to your bucket properties tab
  • In AWS Console: Set tag Key = arize-ingestion-key and Value as the value copied from the Arize UI in the previous step
Add arize-ingestion-key as a bucket tag
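The same tag can be applied with boto3. A minimal sketch, assuming the bucket name from the earlier example; the tag value shown is a placeholder for the value you copied from the Arize UI.

```python
TAG_KEY = "arize-ingestion-key"
TAG_VALUE = "value_from_arize_ui"  # paste the value copied from the Arize UI

tagging = {"TagSet": [{"Key": TAG_KEY, "Value": TAG_VALUE}]}

def tag_bucket(bucket: str = "arize-models") -> None:
    """Apply the ownership tag (needs AWS credentials)."""
    import boto3
    # Caution: put_bucket_tagging REPLACES the bucket's entire tag set,
    # so include any existing tags in TagSet as well.
    boto3.client("s3").put_bucket_tagging(Bucket=bucket, Tagging=tagging)
```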

Step 5. Define your file's schema

Make sure your file's column names comply with Arize's model structure. See file schema.
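To make the idea concrete, here is a small sketch that writes a CSV with columns a fraud model might use. The column names (prediction_id, prediction_ts, amount, prediction_label, actual_label) are hypothetical; the schema mapping you define in the import wizard determines what Arize actually expects for your model.

```python
import csv
import io

# Hypothetical columns for the cc-fraud example; your wizard schema
# mapping defines the real required names.
rows = [
    {"prediction_id": "txn-001", "prediction_ts": 1700000000,
     "amount": 42.5, "prediction_label": "fraud", "actual_label": "not_fraud"},
    {"prediction_id": "txn-002", "prediction_ts": 1700000060,
     "amount": 7.0, "prediction_label": "not_fraud", "actual_label": "not_fraud"},
]

# Render the rows as CSV text (write this to a file for a real import).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
```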

Step 6. Add model data to the bucket

Put your model's inference data under the prefix, ensuring the columns match the model schema. Arize file import jobs run workers that continuously monitor your bucket, so the new file will be discovered and processed during the next import interval.
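A minimal upload sketch, again using the example bucket and prefix from Step 1 (assumptions; substitute your own). Files written outside the configured prefix are not picked up by the import job.

```python
import os

def import_key(prefix: str, filename: str) -> str:
    """Key the file will have under the watched import prefix."""
    return prefix.rstrip("/") + "/" + filename

def upload_inferences(local_path: str,
                      bucket: str = "arize-models",
                      prefix: str = "cc-fraud/") -> str:
    """Upload a CSV under the import prefix and return its key
    (needs AWS credentials)."""
    import boto3
    key = import_key(prefix, os.path.basename(local_path))
    boto3.client("s3").upload_file(local_path, bucket, key)
    return key
```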

Step 7. Check to Make Sure File Import Job is Successful

If your file import settings are correct, you will see the import job start to load, and data will appear in the Arize platform under the Data Ingestion tab.

Applying Bucket Policy & Tag via Terraform

resource "aws_s3_bucket" "arize-example-bucket" {
  bucket = "my-arize-example-bucket"

  tags = {
    arize-ingestion-key = "value_from_arize_ui"
  }
}

resource "aws_s3_bucket_policy" "grant_arize_read_only_access" {
  bucket = aws_s3_bucket.arize-example-bucket.id
  policy = data.aws_iam_policy_document.grant_arize_read_only_access.json
}

data "aws_iam_policy_document" "grant_arize_read_only_access" {
  statement {
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::<REDACTED>:role/arize-importer"]
    }

    actions = [
      "s3:GetBucketTagging",
      "s3:GetObject",
      "s3:ListBucket",
    ]

    resources = [
      aws_s3_bucket.arize-example-bucket.arn,
      "${aws_s3_bucket.arize-example-bucket.arn}/*",
    ]
  }
}
If you are seeing issues with your import job, please see: File Importer Troubleshooting