12.19.2024
New Releases, Enhancements, + Changes
The Prompt Hub is a centralized repository for managing, iterating, and deploying prompt templates within the Arize platform. It serves as a collaborative workspace for users to refine and store templates for various use cases, including production applications and experimentation.
Key features of the Prompt Hub include:
Template Management: Users can save templates directly from the Prompt Playground along with associated LLM parameters, function definitions, and metadata required to reproduce specific LLM calls.
Version Control: Every saved template supports versioning, enabling users to track updates, experiment with variations, and revert to previous versions if needed.
Collaboration and Reusability: Saved templates can be shared across teams, facilitating collaboration and consistency in production workflows. Templates can also be reloaded into the Prompt Playground or accessed via APIs for seamless integration into codebases and online tasks.
Evaluation and Optimization: By saving outputs as experiments, users can compare templates, compute evaluation metrics, and analyze performance both quantitatively and qualitatively.
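To make the idea of a saved template concrete, here is a minimal TypeScript sketch of the kind of record a versioned template bundles together. The field names are illustrative assumptions, not the actual Prompt Hub schema.

```typescript
// Illustrative only: field names are hypothetical, not the Arize Prompt Hub schema.
interface PromptTemplateVersion {
  name: string;                                   // e.g. "support-bot-summarizer"
  version: number;                                // incremented on every save
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  model: string;                                  // LLM the template targets
  invocationParameters: { temperature?: number; maxTokens?: number };
  functions?: object[];                           // tool / function definitions, if any
  metadata?: Record<string, string>;
}

// Fill {variable} placeholders before replaying the template against an LLM.
function renderTemplate(template: string, variables: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_, key) => variables[key] ?? `{${key}}`);
}
```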
We recently launched a set of pre-built, off-the-shelf evaluators that let users evaluate their spans without sending any requests to an LLM-as-a-judge.
Evaluators available:
Matches Regex: Checks whether the text matches a specific regular expression pattern.
JSON Parseable: Validates that LLM output is parseable JSON.
Contains Any Keyword: Checks whether any of the specified keywords appear in the text.
Contains All Keywords: Validates that all specified keywords are present.
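The checks themselves are simple, deterministic string operations. A rough TypeScript sketch of the equivalent logic (illustrative, not the Arize implementation) looks like this:

```typescript
// Rough equivalents of the off-the-shelf evaluators; illustrative, not Arize's implementation.

// Matches Regex: does the output match a given pattern?
const matchesRegex = (output: string, pattern: RegExp): boolean => pattern.test(output);

// JSON Parseable: is the output valid JSON?
const jsonParseable = (output: string): boolean => {
  try {
    JSON.parse(output);
    return true;
  } catch {
    return false;
  }
};

// Contains Any Keyword: is at least one keyword present?
const containsAnyKeyword = (output: string, keywords: string[]): boolean =>
  keywords.some((k) => output.toLowerCase().includes(k.toLowerCase()));

// Contains All Keywords: are all keywords present?
const containsAllKeywords = (output: string, keywords: string[]): boolean =>
  keywords.every((k) => output.toLowerCase().includes(k.toLowerCase()));

// Example: flag responses that never mention a refund or return.
console.log(containsAnyKeyword("You can return items within 30 days.", ["refund", "return"])); // true
```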
We just released a new flow for creating experiments from outputs generated with the Prompt Playground. What's new?
Quickly Experiment: After running the playground successfully on a dataset, click the "Save as Experiment" button.
Debug: In addition to the newly generated response, we save the LLM invocation parameters and prompt template message structure for greater replay functionality.
Compare: Just like existing experiments, playground outputs can be compared against one another.
We’ve rolled out the first part of our monitor improvements! Here's what's new:
Alert Status Graph: Maps directly to the alerts users see, giving them a transparent and seamless way to line up alerts with the real-time metric visualization.
Cleaner UX: Updates include removing "last run monitor time," aligning card titles and Y-axis with metric names, and simplifying by removing granularity.
Note: Alert ticks are limited—users may need to zoom into specific dates to see all alerts.
Support for sessions via LangChain native thread tracking in TypeScript is now available. Easily track multi-turn conversations (threads) using LangChain.js.
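As a minimal sketch of what this could look like in LangChain.js, assuming tracing instrumentation is already registered elsewhere and that a thread/session identifier passed in run metadata is what groups turns into a single session (the metadata key name here is an assumption):

```typescript
import { ChatOpenAI } from "@langchain/openai";

// Assumes OpenInference LangChain instrumentation is set up elsewhere and that a
// thread/session id in run metadata groups turns into one session (key name is an assumption).
const model = new ChatOpenAI({ model: "gpt-4o-mini" });

const threadId = "conversation-1234"; // stable id reused across every turn

async function chatTurn(userMessage: string) {
  // Pass the same thread id with each invocation of the conversation.
  return model.invoke(userMessage, { metadata: { thread_id: threadId } });
}

async function main() {
  // Two turns of the same conversation share the same thread_id,
  // so they can be tracked together as one session / thread.
  await chatTurn("What's the weather like in Paris?");
  await chatTurn("And how about tomorrow?");
}

main();
```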
The latest video tutorials, paper readings, ebooks, self-guided learning modules, and technical posts:
How Booking.com Personalizes Travel Planning with AI Trip Planner and Arize AI
How to Add LLM Evaluations to CI/CD Pipelines
2025 AI Conferences
Merge, Ensemble, and Cooperate! A Survey on Collaborative LLM Strategies