A Task is any function that you want to test on a dataset. It can wrap an LLM generation or be a general-purpose function.
Here's the simplest version of a task:
from typing import Dict
def task(dataset_row: Dict):
    # Return the row unchanged
    return dataset_row
When you create a dataset, each row is stored as a dictionary whose attributes you can retrieve within your task. These can be the user input, the expected output for an evaluation task, or metadata attributes.
Here's an example where you retrieve each of those values in your task.
"""Example dataset
dataframe = pd.DataFrame({
    "id": [1, 2],
    "attribute": ["example", "example2"],
    "question": ["what is 1+1", "why is the sky blue"],
    "expected": ["2", "because i said so"]
})
"""
# The trailing comments show the value retrieved for the first row of the dataset
def task(dataset_row: Dict):
    question = dataset_row.get("question")  # what is 1+1
    expected = dataset_row.get("expected")  # 2
    attribute = dataset_row.get("attribute")  # example
    data_id = dataset_row.get("id")  # 1
    return expected
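To see what a task receives and returns, here is a minimal sketch that calls a task by hand on plain dictionaries mirroring the two example rows. The `rows` list and the loop are assumptions made for illustration; they stand in for whatever actually feeds dataset rows to your task and are not part of any library API.

```python
from typing import Any, Dict, List

# Plain dictionaries standing in for the two example dataset rows (illustrative only).
rows: List[Dict[str, Any]] = [
    {"id": 1, "attribute": "example", "question": "what is 1+1", "expected": "2"},
    {"id": 2, "attribute": "example2", "question": "why is the sky blue", "expected": "because i said so"},
]

def task(dataset_row: Dict[str, Any]) -> Any:
    # dict.get returns None instead of raising KeyError when a column is missing
    return dataset_row.get("expected")

# Hypothetical local loop: call the task once per row and collect the outputs.
outputs = [task(row) for row in rows]
print(outputs)  # ['2', 'because i said so']
```

Because the task reads values with `.get`, a row that lacks the `expected` column yields `None` rather than an error, which is worth keeping in mind when a dataset has optional columns.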
Let's create a task that uses an LLM to answer a question.