Quickstart: Evaluation
This guide shows you how to set up an online evaluation in the Arize UI, which runs automatically on your data as your LLM application is used. You can set up LLM-as-a-judge evaluators or code evaluators, which run in a Python container against your data.
If you'd like to run your own evaluations offline using the Arize SDK, follow this guide instead.
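For reference, here is a minimal sketch of what an offline evaluation might look like with the open-source arize-phoenix-evals package. The dataframe contents, column names, and model choice are illustrative assumptions, not part of this guide:

```python
# Minimal offline-evaluation sketch (assumes arize-phoenix-evals and openai are
# installed and OPENAI_API_KEY is set). The data and model choice are illustrative.
import pandas as pd
from phoenix.evals import (
    HALLUCINATION_PROMPT_RAILS_MAP,
    HALLUCINATION_PROMPT_TEMPLATE,
    OpenAIModel,
    llm_classify,
)

# Example dataframe with the columns the hallucination template expects.
df = pd.DataFrame(
    {
        "input": ["What is the capital of France?"],
        "reference": ["Paris is the capital of France."],
        "output": ["The capital of France is Paris."],
    }
)

rails = list(HALLUCINATION_PROMPT_RAILS_MAP.values())
evals_df = llm_classify(
    dataframe=df,
    model=OpenAIModel(model="gpt-4o"),
    template=HALLUCINATION_PROMPT_TEMPLATE,
    rails=rails,
    provide_explanation=True,  # adds an "explanation" column alongside the label
)
print(evals_df[["label", "explanation"]])
```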
Navigate to the tasks page and click "new task".
Choose which traces you want to evaluate and how often you want the task to run. You can run it once to backfill historical data, or run it continuously against new data.
Choose the LLM provider, model, and other parameters for your LLM evaluation.
Select one of our pre-built evaluators, or use Copilot to write one on your behalf! You can also write your own evaluation template or write Python code to evaluate your LLM outputs, as in the sketch below.
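To make the code-evaluator option concrete, here is a minimal sketch of the kind of check such an evaluator might perform. The function name, signature, and label values are illustrative assumptions; the exact interface is defined when you configure the task in the UI:

```python
# Illustrative code-evaluator sketch: flags outputs that leak internal markers.
# The evaluate() name, its signature, and the returned label/score are assumptions
# for illustration; adapt them to the interface you configure in the Arize UI.
from typing import Tuple

FORBIDDEN_MARKERS = ("INTERNAL ONLY", "DO NOT SHARE")

def evaluate(output: str) -> Tuple[str, float]:
    """Return a (label, score) pair for a single LLM output."""
    leaked = any(marker in output for marker in FORBIDDEN_MARKERS)
    label = "fail" if leaked else "pass"
    score = 0.0 if leaked else 1.0
    return label, score

# Example usage:
print(evaluate("Here is the answer you asked for."))  # ("pass", 1.0)
print(evaluate("INTERNAL ONLY: draft response."))     # ("fail", 0.0)
```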
Once your task is successfully created, a green pop-up notification will appear! Navigate to the tracing page to view the evaluation labels generated by your newly created task.