07.10.2024

New Releases, Enhancements, + Changes

What's New

Copilot

Arize Copilot is an AI assistant for AI, helping users alongside their journey to building better AI. Copilot is an advanced AI tool designed for data scientists and AI engineers to understand and improve their models and applications quickly. It automates insights and troubleshooting, allowing users to swiftly take action to make improvements.

  • Versatile Skill Set: Copilot offers high-level model insights, data quality analysis, and LLM-specific functionalities like evaluation summarization and retrieval process troubleshooting.

  • Advanced LLM Development: Copilot identifies issues and patterns in your evaluation results, suggesting pre-built or custom evaluations.

  • Prompt Optimization: In the prompt playground, Copilot optimizes your prompts based on specific concerns or evaluation data.

  • Powerful Data Curation: Use Copilot’s AI Search to curate data with natural language queries combined with traditional filters.

Users can leverage Copilot through the chat interface, or anywhere they see a Copilot ✨ icon.

To get access, please reach out to your account representative.

Learn more ->

Datasets

Datasets are collections of examples that are used to run experiments, label, or evaluate the performance of an application. Users can easily add selected examples to a new or existing dataset. Learn more ->

Experiments

In AI development, it's hard to understand how a change will affect performance. This breaks the dev flow, making iteration more guesswork than engineering. With Experiments, users can apply a change such as a prompt template change, retrieval approach change, or even LLM change, and apply it across a dataset for evaluation prior to deploying the change into production. Learn more ->

Tasks: Online LLM Evals

As your application scales and the number of production logs increases, it can be cumbersome to manage manually your data. Tasks let you create and run automated actions on your LLM spans. Users can now set up Tasks to automate actions on data, with Online LLM Evaluations (continuous evals) being the first supported task type. Look out for new Task templates in the coming weeks. Learn more ->

Guardrails

Guardrails for LLMs ensure real-time safety, context management, compliance, and user experience. Guardrails provide immediate correction of inappropriate content, and can be applied to either user input messages (e.g. jailbreak attempts) or LLM output messages (e.g. answer relevance). If a message in a LLM chat fails a Guard, then the Guard will take a corrective action, either providing a default response to the user or prompting the LLM to generate a new response. Learn more ->

Upgraded Prompt Playground

Prompt Playground has gotten a redesign to allow for better experimentation and customization of prompt templates and variables. Users can now chain together a series of system and user messages to test the chatbot on a specific example, adding a list of input Variables in {mustache} notation and specifying their values in the Variables column. Learn more ->

Token Counting

Use token counts to find problematic traces, long running conversations or investigate prompt variable or context overflow. Learn more ->

Query Filters in Traces

We’re excited to introduce the new Query Filters on Traces, making it easier and faster to filter your data. Key updates include:

  • Easy Filtering: Click the filter bar to type, select, or use Copilot to search for filters.

  • Improved Visibility: See all applied filters clearly.

  • Fewer Clicks: Add filters with minimal steps.

  • Export & Reuse: Copy filters to Monitors and Dashboards effortlessly.

Enhancements

Ingest Non-Open Inference Attributes

Arize now supports ingestion of non-open inference attributes. If there are other attributes in the otlp request payload (e.g. attributes.customer_name), they will appear in the platform and be filterable and selectable.

📚 New Content

The latest video tutorials, paper readings, ebooks, self-guided learning modules, and technical posts:

Last updated