12.05.2024
New Releases, Enhancements, + Changes
The Copilot Span Chat skill makes getting value from spans faster and easier. Rather than spending time scrolling through and deciphering span data, teams can now:
Analyze spans to extract key insights
Ask questions to quickly understand span data
Run evaluations on individual spans
Dashboard Widget Generator
Building dashboard plots just got way easier. The dashboard skill lets teams:
Create time series plots or distributions from natural language
Translate code (like Plotly) into ready-to-go visualizations
Handle ambiguous filters like "west coast states" and plot multiple widgets at once
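As an illustration of the kind of Plotly code someone might hand to the dashboard skill, here is a minimal sketch of a time series figure written as a plain Plotly figure dictionary. The function name, metric name, and sample values are hypothetical, and nothing here reflects how the skill itself parses or translates the code.

```python
def latency_timeseries_figure(timestamps, values, title="p95 latency over time"):
    """Build a Plotly-style figure dictionary for a single line plot.

    Hypothetical example: the metric name and axis labels are placeholders.
    """
    return {
        "data": [
            {
                "type": "scatter",   # Plotly trace type used for line plots
                "mode": "lines",
                "x": list(timestamps),
                "y": list(values),
                "name": "p95 latency",
            }
        ],
        "layout": {
            "title": {"text": title},
            "xaxis": {"title": {"text": "time"}},
            "yaxis": {"title": {"text": "latency (ms)"}},
        },
    }


# Made-up sample data, only to show the shape of the figure.
fig = latency_timeseries_figure(
    ["2024-12-01", "2024-12-02", "2024-12-03"],
    [120, 135, 128],
)
```

With Plotly installed, passing this dictionary to plotly.io.show(fig) renders the chart locally.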
Misc. Copilot Updates
We’ve revamped the main chat experience to be always accessible on the page, with an option to collapse the input bar
The Custom Metric skill now supports a conversational flow, making it easier for users to iterate and refine metrics dynamically
Experiment traces for a dataset are now consolidated and can be accessed under "Experiment Projects" on the "Projects & Models" page.
We’ve just rolled out per-class calibration metrics and a calibration chart. Users can see calibration scores for each class separately and view the calibration chart, all in one place.
To view per-class calibration, select calibration from the metric dropdown and choose a class
The calibration chart can be found under the "More Charts" tab
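For readers who want intuition about what a per-class calibration score and chart can represent, the sketch below uses one common approach: treat a single class one-vs-rest, bin its predicted probabilities, and compare the mean predicted probability in each bin with the observed fraction of that class. This is an assumption for illustration and not necessarily the formula the platform uses; the function name, input layout, and bin count are hypothetical.

```python
from collections import defaultdict


def per_class_calibration(probs, labels, target_class, n_bins=5):
    """One-vs-rest reliability bins and expected calibration error for one class.

    probs:  list of dicts mapping class name -> predicted probability
    labels: list of true class names, aligned with probs
    Returns (chart, ece), where chart is a list of
    (mean predicted probability, observed fraction of target_class, count) per bin.
    """
    # bin index -> [count, sum of predicted probability, number of positives]
    bins = defaultdict(lambda: [0, 0.0, 0])
    for p, y in zip(probs, labels):
        conf = p[target_class]
        idx = min(int(conf * n_bins), n_bins - 1)  # keep conf == 1.0 in the last bin
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += int(y == target_class)

    total = sum(b[0] for b in bins.values())
    chart, ece = [], 0.0
    for idx in sorted(bins):
        count, conf_sum, positives = bins[idx]
        mean_conf = conf_sum / count
        frac_pos = positives / count
        chart.append((mean_conf, frac_pos, count))
        # Weight each bin's calibration gap by its share of the examples.
        ece += (count / total) * abs(mean_conf - frac_pos)
    return chart, ece
```

Plotting the first two values of each tuple in `chart` against each other gives a reliability-style curve for the chosen class; a perfectly calibrated class would sit on the diagonal.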
Log experiments from a previously created dataframe
The latest video tutorials, paper readings, ebooks, self-guided learning modules, and technical posts:
🧑‍⚖️ Agent-as-a-Judge: Evaluate Agents with Agents
🤖 LLM-as-a-Judge Evaluation for GenAI Use-Cases
🌎 Building an AI Agent that Thrives in the Real World