Online Evals

You can use cron to run evals client-side as your traces and spans are generated, augmenting your dataset with evaluations in an online manner. View the example on GitHub.

This example:

  • Continuously queries a LangChain application to send new traces and spans to your Phoenix session

  • Queries new spans once per minute and runs evals, including:

    • Hallucination

    • Q&A Correctness

    • Relevance

  • Logs evaluations back to Phoenix so they appear in the UI (see the sketch after this list)
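
The per-minute job boils down to three steps: pull recent spans from Phoenix, run the LLM evaluators over them, and log the results back. Below is a minimal sketch of that loop using the phoenix.evals helpers; the eval model ("gpt-4o") is an assumption, so substitute whatever model you use, and check the evaluator names against your installed version of the library.

```python
# run_evals.py -- a minimal sketch of the query-evaluate-log loop.
# Assumes arize-phoenix is installed and a Phoenix instance is reachable;
# "gpt-4o" is an illustrative choice, not prescribed by the example.
import phoenix as px
from phoenix.evals import (
    HallucinationEvaluator,
    OpenAIModel,
    QAEvaluator,
    RelevanceEvaluator,
    run_evals,
)
from phoenix.session.evaluation import get_qa_with_reference, get_retrieved_documents
from phoenix.trace import DocumentEvaluations, SpanEvaluations

client = px.Client()  # connects to the running Phoenix instance

# Pull Q&A spans (with reference context) and retrieved-document spans.
queries_df = get_qa_with_reference(client)
documents_df = get_retrieved_documents(client)

eval_model = OpenAIModel(model="gpt-4o")

# Hallucination and Q&A correctness run over the Q&A spans...
hallucination_df, qa_df = run_evals(
    dataframe=queries_df,
    evaluators=[HallucinationEvaluator(eval_model), QAEvaluator(eval_model)],
    provide_explanation=True,
)
# ...while relevance runs over the retrieved documents.
(relevance_df,) = run_evals(
    dataframe=documents_df,
    evaluators=[RelevanceEvaluator(eval_model)],
    provide_explanation=True,
)

# Log the results back so they appear in the Phoenix UI.
client.log_evaluations(
    SpanEvaluations(eval_name="Hallucination", dataframe=hallucination_df),
    SpanEvaluations(eval_name="QA Correctness", dataframe=qa_df),
    DocumentEvaluations(eval_name="Relevance", dataframe=relevance_df),
)
```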

The evaluation script is run as a cron job, so you can adjust how often evals are computed by editing the schedule. The entry below runs the script once per minute:

* * * * * /path/to/python /path/to/run_evals.py
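
Because the job fires every minute, you typically want each run to pick up only the spans generated since the previous run rather than re-evaluating everything. One way to do that, sketched below as an illustrative assumption (the checkpoint file is not part of the example repo), is to persist the last run time and pass it as `start_time` when querying spans:

```python
# Sketch: restrict each cron run to spans started since the previous run.
# The checkpoint path is a hypothetical choice for illustration.
from datetime import datetime, timezone
from pathlib import Path

import phoenix as px

CHECKPOINT = Path("/tmp/last_eval_run.txt")  # hypothetical state file

start_time = None
if CHECKPOINT.exists():
    start_time = datetime.fromisoformat(CHECKPOINT.read_text().strip())

client = px.Client()
# With no start_time (first run), this fetches all spans.
spans_df = client.query_spans(start_time=start_time)

# ... run the evaluators over spans_df and log the results as shown above ...

CHECKPOINT.write_text(datetime.now(timezone.utc).isoformat())
```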

Run on this schedule, the script continuously augments your evals in Phoenix as new spans arrive.
