Custom Metric Examples
Common example use cases
Custom metrics let you assess many aspects of your LLM application, from cost and usage to application performance, and you can tailor each evaluation to your specific needs. Use the examples on this page as a starting point for creating your own custom metrics; the list is not exhaustive.
Percent of Correct
This example demonstrates how to calculate the percentage of predictions with a correct QA_Correctness_Eval. We achieve this using a FILTER (WHERE ...) clause, applying the filter only to the numerator and not the denominator:
Learn more about FILTER (WHERE) clauses here.
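To make the numerator/denominator distinction concrete, here is a minimal Python sketch of the same calculation on a handful of sample rows. It is an illustration of the arithmetic only, not the custom metric syntax itself; the function name `percent_with_label` and the sample labels are hypothetical.

```python
def percent_with_label(labels, target="correct"):
    """Return the percentage of rows whose evaluation label equals `target`."""
    if not labels:
        return 0.0
    # Numerator: only the rows that pass the filter,
    # which is the role FILTER (WHERE ...) plays in the metric.
    matching = sum(1 for label in labels if label == target)
    # Denominator: every row, with no filter applied.
    total = len(labels)
    return 100.0 * matching / total


# Hypothetical evaluation labels for four predictions.
sample_labels = ["correct", "incorrect", "correct", "correct"]
print(percent_with_label(sample_labels))  # 75.0
```

Passing a different `target` (for example "incorrect") gives the complementary percentage without changing the denominator.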
Performance Metrics
Use natively supported performance metrics as functions that can take multiple arguments for enhanced flexibility. Additionally, you can create entirely new metrics using conditionals and other logic, as shown below. Explore the documentation for performance metrics here.
Precision
You can calculate the precision of your evaluations using annotations as the ground truth:
Learn more about PRECISION and related functions here. You can also use any of our built-in functions to assess the performance of your evaluations, and you can use a dimension you have traced as the source of your actual values.
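As a reference for what a precision metric computes, the sketch below treats evaluation labels as predictions and human annotations as ground truth. It is plain Python written for illustration, not the built-in PRECISION function; the function name, the positive class "correct", and the sample data are assumptions.

```python
def precision(predicted, actual, positive="correct"):
    """Precision = true positives / (true positives + false positives)."""
    pairs = list(zip(predicted, actual))
    # True positive: the eval says positive and the annotation agrees.
    true_pos = sum(1 for p, a in pairs if p == positive and a == positive)
    # False positive: the eval says positive but the annotation disagrees.
    false_pos = sum(1 for p, a in pairs if p == positive and a != positive)
    if true_pos + false_pos == 0:
        return None  # undefined when nothing was predicted positive
    return true_pos / (true_pos + false_pos)


# Hypothetical eval labels (predictions) and annotations (ground truth).
eval_labels = ["correct", "correct", "incorrect", "correct"]
annotations = ["correct", "incorrect", "incorrect", "correct"]
print(precision(eval_labels, annotations))  # 2 true positives, 1 false positive
```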
Total Costs
Estimate total usage, the basis for total cost, by summing the completion and prompt token counts:
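The sum itself can be sketched in a few lines of Python. The row layout (prompt count, completion count) and the choice to treat missing counts as zero are assumptions made for this illustration.

```python
def total_tokens(rows):
    """Sum prompt and completion token counts across all rows.

    `rows` is an iterable of (prompt_tokens, completion_tokens) pairs;
    a missing count (None) is treated as zero in this sketch.
    """
    return sum((prompt or 0) + (completion or 0) for prompt, completion in rows)


# Hypothetical token counts for three traced calls.
sample_rows = [(120, 30), (80, None), (200, 50)]
print(total_tokens(sample_rows))  # 480
```

Multiplying this total by a per-token rate turns the usage figure into a cost figure.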
Average Cost per User Query
Below is an example with the per-input-token cost set to 0.0000025. Adjust it according to your per-token cost:
Alternatively, you can use per-million token costs:
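The two pricing conventions are equivalent: a rate of 0.0000025 per token is 2.50 per million tokens. The Python sketch below shows both forms of the average; the function names and sample token counts are hypothetical, and the rate is the example value from the text, not a real price.

```python
PER_TOKEN_COST = 0.0000025       # example rate from the text; substitute your own
PER_MILLION_TOKEN_COST = 2.50    # the same rate expressed per million tokens


def avg_cost_per_query(prompt_token_counts, per_token_cost=PER_TOKEN_COST):
    """Average input cost per query using a per-token rate."""
    if not prompt_token_counts:
        return 0.0
    return sum(prompt_token_counts) * per_token_cost / len(prompt_token_counts)


def avg_cost_per_query_per_million(prompt_token_counts,
                                   per_million_cost=PER_MILLION_TOKEN_COST):
    """Average input cost per query using a per-million-token rate."""
    if not prompt_token_counts:
        return 0.0
    total_millions = sum(prompt_token_counts) / 1_000_000
    return total_millions * per_million_cost / len(prompt_token_counts)


# Hypothetical prompt token counts for two user queries.
counts = [1000, 3000]
print(avg_cost_per_query(counts))              # approximately 0.005
print(avg_cost_per_query_per_million(counts))  # approximately 0.005
```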
Evaluation Cost Estimate
To calculate evaluation costs, estimate token counts by exporting traces and determining the token length of your evaluation template along with the average output cost. Below is an example where the prompt template includes input and output as variables, allowing for direct token count calculation. Adjust your template accordingly if additional variables are present:
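The estimate described above can be sketched as follows. Each evaluation call sends the template plus the traced input and output, and returns a short response. All names and numbers here are hypothetical: the token counts are assumed to have been measured from exported traces with your tokenizer of choice, and the rates are placeholders.

```python
def estimate_eval_cost(rows, template_tokens, avg_eval_output_tokens,
                       input_cost_per_token, output_cost_per_token):
    """Estimate the total cost of running one evaluation over `rows`.

    `rows` holds (input_tokens, output_tokens) pairs for each traced example.
    """
    total = 0.0
    for input_tokens, output_tokens in rows:
        # The eval prompt is the template with input and output substituted in.
        prompt_tokens = template_tokens + input_tokens + output_tokens
        total += prompt_tokens * input_cost_per_token
        # The eval's own response is billed at the output rate.
        total += avg_eval_output_tokens * output_cost_per_token
    return total


# Hypothetical measurements and rates.
traced_rows = [(100, 50), (200, 100)]
cost = estimate_eval_cost(
    traced_rows,
    template_tokens=300,
    avg_eval_output_tokens=10,
    input_cost_per_token=0.0000025,
    output_cost_per_token=0.00001,
)
print(round(cost, 6))  # 0.002825
```

If the template has more variables than input and output, add their token counts to `prompt_tokens` in the same way.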
Number of Errors
To determine the total number of errors, use the following query:
Number of Sessions
Estimate the total number of distinct sessions:
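For reference, counting distinct sessions amounts to counting unique session identifiers. The Python sketch below computes an exact count on sample data, whereas a metric over large volumes may return an approximation. The function name and session IDs are hypothetical, and rows without a session ID are skipped by assumption.

```python
def count_distinct_sessions(session_ids):
    """Count unique, non-missing session identifiers."""
    return len({sid for sid in session_ids if sid is not None})


# Hypothetical session IDs attached to five traces; one trace has none.
sample_ids = ["s1", "s2", "s1", None, "s3"]
print(count_distinct_sessions(sample_ids))  # 3
```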