Create a Task
A Task is a function that runs on a dataset. It represents something you want to test in isolation, offline, before you push your code to production. A task can represent an LLM generation or response in an LLM application, a test of a new LLM template, a test of a new model, an LLM generating code, or a general-purpose function.
The input to the Task function is an instance of the Example class, which represents one row of a dataframe.
The dataframe is converted row by row into Examples, and each Example is passed to the Task under test.
An Example has the following attributes:
example.id
The id of the example
example.input
The input column of the dataframe (attributes.llm.input_messages)
example.output
The output column of the dataframe (attributes.output.value)
example.dataset_row
All columns of the dataframe row. Arize supports any number of columns.
example.metadata
The metadata column of the dataframe (attributes.metadata)
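As a minimal sketch of the attributes listed above, the following uses a hypothetical `Example` dataclass as a stand-in. It is not Arize's actual class, and the task name `echo_task` and the `topic` column are illustrations only. It shows a task reading the fields it needs from a single example.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Example:
    """Hypothetical stand-in for the Example passed to a task (not the Arize class)."""
    id: str
    input: Any                       # a string or JSON-like value
    dataset_row: dict                # every column of the row
    output: Optional[str] = None     # expected output, if the dataset has one
    metadata: dict = field(default_factory=dict)


def echo_task(example: Example) -> str:
    """Toy task: a real task would call an LLM or another function here."""
    topic = example.dataset_row.get("topic", "unknown")
    return f"[{topic}] {example.input}"


example = Example(
    id="ex-1",
    input="What is a task?",
    dataset_row={"topic": "docs", "input": "What is a task?"},
)
result = echo_task(example)
print(result)
```

Because `output` and `metadata` have defaults in this sketch, a task can be exercised on rows that carry only an input.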
Each example within a dataset represents a single data point, consisting of a dataset_row dictionary, an input (a string or JSON), an optional output string, and an optional metadata dictionary. The optional output often contains the expected LLM application output for the given input.
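The row-by-row conversion can be sketched as follows. This is an illustration under stated assumptions: the rows are plain dictionaries rather than a real dataframe, the `Example` dataclass and the `rows_to_examples` and `run_task` helpers are hypothetical, and the `id` column name is assumed. The other column names are the ones given above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

# Column names as given in this page; the "id" column name is an assumption.
INPUT_COL = "attributes.llm.input_messages"
OUTPUT_COL = "attributes.output.value"
METADATA_COL = "attributes.metadata"


@dataclass
class Example:
    """Hypothetical stand-in for the Example class (not the Arize class)."""
    id: str
    input: Any
    dataset_row: dict
    output: Optional[str] = None
    metadata: dict = field(default_factory=dict)


def rows_to_examples(rows: List[dict]) -> List[Example]:
    """Turn each row into one Example; output and metadata may be missing."""
    examples = []
    for index, row in enumerate(rows):
        examples.append(
            Example(
                id=str(row.get("id", index)),
                input=row.get(INPUT_COL),
                dataset_row=dict(row),               # keep every column
                output=row.get(OUTPUT_COL),          # None when absent
                metadata=row.get(METADATA_COL) or {},
            )
        )
    return examples


def run_task(task: Callable[[Example], str], examples: List[Example]) -> Dict[str, str]:
    """Apply the task to each example and collect results by example id."""
    return {ex.id: task(ex) for ex in examples}


def shout_task(example: Example) -> str:
    """Toy task standing in for an LLM call."""
    return str(example.input).upper()


rows = [
    {"id": "a", INPUT_COL: "hello", OUTPUT_COL: "HELLO", METADATA_COL: {"source": "demo"}},
    {"id": "b", INPUT_COL: "no expected output"},
]
examples = rows_to_examples(rows)
results = run_task(shout_task, examples)
print(results)
```

The second row has neither an output nor metadata, which is why those fields are optional in the sketch.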