Arize AI
GCS Example
How to set up an import job to ingest data into Arize using Google Cloud Storage (GCS)
A few steps need to be completed for Arize to start syncing your files as model inferences.
If you prefer to use Terraform, jump to Applying Bucket Policy & Tag via Terraform

Step 1. Get the Bucket Name and Prefix

Create a bucket (if you don't have one already) and, optionally, a folder from which you would like Arize to pull a particular model's inferences.
For example, you might set up a GCS bucket and folder named gs://bucket1/click-thru-rate/production/v1/ that contains CSV files of your model inferences.
In this example, your bucket name is bucket1 and your prefix is click-thru-rate/production/v1/.
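The split between bucket name and prefix can be derived mechanically from a gs:// URI. A minimal Python sketch (the helper name is illustrative, not part of Arize's tooling):

```python
def split_gcs_uri(uri: str) -> tuple[str, str]:
    """Split a gs:// URI into (bucket_name, prefix)."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a GCS URI: {uri}")
    # Everything before the first "/" is the bucket; the rest is the prefix.
    bucket, _, prefix = uri[len("gs://"):].partition("/")
    return bucket, prefix

# The example bucket/folder from above:
print(split_gcs_uri("gs://bucket1/click-thru-rate/production/v1/"))
# → ('bucket1', 'click-thru-rate/production/v1/')
```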

Step 2. Start the File Import Wizard

Navigate to the File Import section of your workspace in the Arize platform.
  • Create a new file import job and follow the steps in the wizard.
  • Storage Selection: Google Cloud Storage
  • Fill in the Bucket Name, Prefix, and Project ID details.
    • The bucket name and prefix correspond to the path where your files were uploaded (step 1).
    • The GCS Project ID is a unique identifier for your project. See the GCS docs for steps on how to retrieve this ID.

Step 3. Add Proof of Ownership to your Bucket

Label your bucket with the key arize-ingestion-key and the value provided in the Arize UI. For more details, see the GCS docs on Using Bucket Labels.
  • In the Arize UI: Copy the arize-ingestion-key value
  • In the Google Cloud console: Navigate to Cloud Storage
    • Here, you will see a list of your buckets. Find the bucket matching the bucket name set in your job (step 2), open its more-options menu, and update its labels to include the arize-ingestion-key.
      • Key: arize-ingestion-key
      • Value: the arize-ingestion-key value from the Arize UI
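If you prefer the CLI to the console, the same label can be applied with gcloud (the bucket name and key value below are placeholders; substitute your own bucket and the value copied from the Arize UI):

```shell
# Apply the arize-ingestion-key label to the bucket (placeholder values).
gcloud storage buckets update gs://bucket1 \
  --update-labels=arize-ingestion-key=VALUE_FROM_ARIZE_UI
```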

Step 4. Grant Arize access privileges

  • Create a custom role
    • Copy the command from the Custom IAM Role field in the Arize UI.
    • Paste and run the gcloud command in the Google Cloud Shell. Be sure to set --project to your project ID.
  • Grant Arize access to the custom role
    • Copy the command from the Apply IAM Permission field in the Arize UI.
    • Paste and run the gsutil command in the Google Cloud Shell. Be sure to update your project ID in the service account path.
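The Arize UI generates both commands for you, but for reference they look roughly like the following sketch (project ID, role ID, and bucket are placeholders; the service account matches the one used in the Terraform example below):

```shell
# Create a custom role with read/list permissions on storage (placeholders throughout).
gcloud iam roles create FileImporterViewer \
  --project=YOUR_PROJECT_ID \
  --title="File Importer Viewer" \
  --permissions=storage.buckets.get,storage.objects.get,storage.objects.list

# Bind the custom role to Arize's service account on the bucket.
gsutil iam ch \
  "serviceAccount:[email protected]:projects/YOUR_PROJECT_ID/roles/FileImporterViewer" \
  gs://bucket1
```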

Step 5. Configure your model and define your file's schema

Continue to the next pages of the job creation workflow. You will be asked to define your model and your file's schema, either through form inputs or through a JSON schema (see the docs for more details).
Set up model configurations
Map your file using form inputs
Map your file using a JSON schema
Once finished, your import job will be created and will start polling your bucket for files.

Step 6. Add model data to the bucket

Put your model's inference data under the configured prefix and ensure the columns match the model schema. Arize file import jobs run workers that continuously track your bucket, so any new file will be discovered and processed during the next import interval.
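For example, a short Python sketch that writes one inference file whose columns match a hypothetical schema (the column names here are purely illustrative; use whatever columns you mapped in step 5):

```python
import csv

# Hypothetical columns; these must match the schema defined in step 5.
COLUMNS = ["prediction_id", "prediction_ts", "prediction_label", "feature_ad_position"]

rows = [
    {"prediction_id": "abc-1", "prediction_ts": "1700000000",
     "prediction_label": "clicked", "feature_ad_position": "top"},
]

# Write the file locally, then upload it under your prefix, e.g. with:
#   gsutil cp inferences.csv gs://bucket1/click-thru-rate/production/v1/
with open("inferences.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
```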

Step 7. Check your File Import Job

Once your file import job settings are configured correctly, you will see the job start to load. The File Imports page lists the files that were ingested successfully and those that failed to process.

Applying Bucket Policy & Tag via Terraform

resource "google_storage_bucket" "arize-example-bucket" {
  // (optional) uniform_bucket_level_access = true
  name    = "arize-example-bucket"
  project = google_project.development.project_id
  labels = {
    "arize-ingestion-key" = "value_from_arize_ui"
  }
}

resource "google_project_iam_custom_role" "arize-example-bucket" {
  description = "permission to view storage bucket, and view and list objects"
  permissions = [
    "storage.buckets.get",
    "storage.objects.get",
    "storage.objects.list"
  ]
  project = google_project.development.project_id
  role_id = "FileImporterViewer"
  title   = "File Importer Viewer"
  stage   = "ALPHA"
}

resource "google_storage_bucket_iam_binding" "arize-example-bucket-iam-binding" {
  bucket = google_storage_bucket.arize-example-bucket.name
  role   = "projects/<PROJECT_ID>/roles/FileImporterViewer"
  members = [
    "serviceAccount:[email protected]",
  ]
}
If you are seeing issues with your import job, please see: File Importer Troubleshooting