Amazon EventBridge Retrain

Leverage Arize's model monitoring to automatically trigger ML Training workflows

Amazon EventBridge integration requires the use of a service like PagerDuty to publish your model monitoring events to Amazon EventBridge. See the PagerDuty guide for details.

Why integrate with Amazon EventBridge?

Integrating with Amazon EventBridge allows teams to easily create event-driven workflows that connect native AWS services with Arize's monitoring capabilities. Want to re-train your model automatically when its predictions drift from its baseline? Amazon EventBridge can help you translate Arize AI's monitoring events into powerful automated workflows.

Connect Arize AI to your incident management platform

Amazon EventBridge integration requires the use of an incident management platform like PagerDuty to act as an event source. For the full list of Amazon partner event sources, check out the EventBridge documentation.

Connect Amazon EventBridge to your incident management platform

Once you have Arize monitors integrated into your incident management platform, you'll have to configure the incident management platform to publish events to EventBridge.

In this example, we use PagerDuty as our incident management platform and integrate with EventBridge by following the steps outlined in their integration guide.

Once the integration is completed, any Arize monitor that fires can be used as a trigger to kick-start a workflow in AWS.
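
If your PagerDuty integration arrives as an EventBridge partner event source, the source also needs to be associated with an event bus in your account before rules can match its events. Below is a minimal boto3 sketch of that association; the source name prefix is an assumption, so check the EventBridge console for the exact name your integration creates.

import boto3

events = boto3.client("events")

# List partner event sources shared with this account (the name is set by the partner).
sources = events.list_event_sources(NamePrefix="aws.partner/pagerduty.com")
source_name = sources["EventSources"][0]["Name"]  # assumes a single PagerDuty source

# Associating the source with an event bus of the same name activates it.
events.create_event_bus(Name=source_name, EventSourceName=source_name)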

Leverage Arize's drift monitoring capabilities to automate ML training workflows

Arize's model monitoring capabilities can be used to auto-trigger ML pipelines within AWS. In this example, we will walk through how to use Arize monitors in conjunction with EventBridge to trigger Airflow jobs that retrain your model whenever it suffers from drift.

Step 1: Setup a monitor to use as an event trigger

Navigate to your model's monitor tab, click on New Monitor, and select Drift Monitor.

Fill out the custom drift monitor to match your re-training criteria (e.g. you may want to re-train your model whenever your predictions drift from your model's baseline). Make sure to name the monitor appropriately (e.g. retrain_model_trigger, matching the Lambda example below) and to use your Integration Email Address (e.g. arize-model-integration@<company>.pagerduty.com - see the PagerDuty docs for details).

Once the monitor is saved and active, you are ready to react to monitor events via EventBridge.

Step 2: Create a Lambda function to handle monitor events

In order to react to Arize's monitoring events, we need to configure a Lambda function that will parse the incident details and trigger a custom workflow. Below is an example Lambda that can be used as a template when integrating with PagerDuty's incident event payload:

import json

def lambda_handler(event, context):

    # Parse the PD incident dictionary
    event_detail = event['detail']
    incident = event_detail['incident']
    title = incident['title']
    
    # Parse the title to determine if it is the retrain trigger monitor
    should_retrain_model = 'retrain_model_trigger' in title and 'Triggered' in title
    
    # Construct a payload to return from the handler that can be used
    # by downstream workflows (AirFlow etc.)
    return {
        'statusCode': 200,
        'body': {
            'retrain': should_retrain_model
        }
    }
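
Before wiring the handler to EventBridge, you can sanity-check it locally with a hand-built event. The payload below is only an illustrative sketch that mirrors the fields the handler reads (detail.incident.title); it is not the full PagerDuty event schema.

# Hypothetical local test: mimics only the fields the handler reads.
sample_event = {
    'detail': {
        'incident': {
            'title': '[Triggered] retrain_model_trigger drift monitor'
        }
    }
}

print(lambda_handler(sample_event, context=None))
# Expected: {'statusCode': 200, 'body': {'retrain': True}}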

Make sure to name your Lambda appropriately so that it's easy to identify. Once complete, save and deploy the Lambda in the same AWS region where you have EventBridge configured.

Step 3: Connect EventBridge with your Lambda handler

We now need to configure an EventBridge rule to utilize the Lambda we created above. Navigate to EventBridge in the AWS console and configure a rule that will invoke the Lambda whenever a matching event is fired. The event pattern and rule details may differ depending on your integration. Once completed, you should have a rule similar to the configuration below:
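
If you prefer to script the rule instead of using the console, a minimal boto3 sketch might look like the following. The rule name, event bus, ARNs, and event pattern are placeholders; inspect a real event from your integration to refine the pattern.

import json
import boto3

# Placeholder names and ARNs: substitute your own values.
RULE_NAME = "arize-retrain-trigger"
LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:arize-retrain-handler"
EVENT_BUS = "default"  # or the partner event bus created by your PagerDuty integration

events = boto3.client("events")
lambda_client = boto3.client("lambda")

# Match events that carry an incident title (illustrative pattern only).
event_pattern = {"detail": {"incident": {"title": [{"exists": True}]}}}

rule = events.put_rule(
    Name=RULE_NAME,
    EventBusName=EVENT_BUS,
    EventPattern=json.dumps(event_pattern),
    State="ENABLED",
)

# Point the rule at the Lambda handler from Step 2.
events.put_targets(
    Rule=RULE_NAME,
    EventBusName=EVENT_BUS,
    Targets=[{"Id": "retrain-handler", "Arn": LAMBDA_ARN}],
)

# Allow EventBridge to invoke the Lambda.
lambda_client.add_permission(
    FunctionName=LAMBDA_ARN,
    StatementId="AllowEventBridgeInvoke",
    Action="lambda:InvokeFunction",
    Principal="events.amazonaws.com",
    SourceArn=rule["RuleArn"],
)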

Step 4: Connect downstream workflows via your Lambda handler
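
As one example of a downstream hook, the handler's result can be used to start an Airflow DAG run through Airflow 2's stable REST API. This is only a sketch: the endpoint, DAG id, and credentials are placeholders, and it assumes the requests package is available to the Lambda.

import requests  # assumes 'requests' is packaged with the Lambda

AIRFLOW_URL = "https://airflow.internal.example.com"  # placeholder endpoint
DAG_ID = "retrain_model"                               # placeholder DAG id

def trigger_retraining(handler_result):
    """Kick off an Airflow DAG run when the handler flags a retrain."""
    if not handler_result["body"].get("retrain"):
        return None
    # Airflow 2.x stable REST API: create a new DAG run.
    response = requests.post(
        f"{AIRFLOW_URL}/api/v1/dags/{DAG_ID}/dagRuns",
        json={"conf": {"triggered_by": "arize_drift_monitor"}},
        auth=("airflow_user", "airflow_password"),  # placeholder credentials
        timeout=30,
    )
    response.raise_for_status()
    return response.json()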

You are all set! You can now fully automate powerful ML workflows within AWS (Airflow, SageMaker, etc.). Need further assistance? Please don't hesitate to reach out to us at support@arize.com.
