An evaluator passed to run_experiment can be a plain callable function. The function may declare any of the following optional parameters:
| Parameter name | Description | Example |
| --- | --- | --- |
| input | experiment run input | def eval(input): ... |
| output | experiment run output | def eval(output): ... |
| dataset_row | the entire row of the data, with every column available as a dictionary key | def eval(dataset_row): ... |
| metadata | experiment metadata | def eval(metadata): ... |
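The parameters above are optional, so an evaluator can declare only the ones it needs. The sketch below illustrates that idea in isolation. It is not the SDK's implementation: `call_evaluator` is a hypothetical helper that passes each function only the arguments it names, and `attributes.expected` is an assumed column name used purely for illustration.

```python
import inspect


def exact_match(output, dataset_row):
    # Compares the task output with a reference column.
    # 'attributes.expected' is a hypothetical column name.
    return float(output == dataset_row["attributes.expected"])


def output_length(output):
    # Needs only the task output.
    return len(output)


def has_input(input):
    # Needs only the experiment run input.
    return input is not None


def call_evaluator(fn, **available):
    """Hypothetical helper: call fn with only the arguments it declares."""
    wanted = inspect.signature(fn).parameters
    return fn(**{name: value for name, value in available.items() if name in wanted})


# Made-up values standing in for one experiment run.
available = {
    "input": "2+2",
    "output": "4",
    "dataset_row": {"attributes.expected": "4"},
    "metadata": {"run": "demo"},
}

results = {
    fn.__name__: call_evaluator(fn, **available)
    for fn in (exact_match, output_length, has_input)
}
print(results)
```

Each function receives only what its signature asks for, which is why the three evaluators can coexist with different parameter lists.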
Define Function Evaluator and Run Experiment
    def edit_distance(dataset_row, output):
        str1 = dataset_row['attributes.str1']  # Input used in task
        str2 = output  # Output from task
        dp = [[i + j if i * j == 0 else 0 for j in range(len(str2) + 1)] for i in range(len(str1) + 1)]
        for i in range(1, len(str1) + 1):
            for j in range(1, len(str2) + 1):
                dp[i][j] = dp[i - 1][j - 1] if str1[i - 1] == str2[j - 1] else 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
        return dp[-1][-1]

    experiment1 = arize_client.run_experiment(
        space_id=space_id,
        dataset_name=dataset_name,
        task=prompt_gen_task,
        evaluators=[edit_distance],
        experiment_name="test",
    )
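Before wiring the evaluator into an experiment, it can help to call it directly on a hand-built row. The following is a minimal local check of the same edit-distance function. The sample row and strings are made up for illustration, and no Arize client is involved.

```python
def edit_distance(dataset_row, output):
    str1 = dataset_row["attributes.str1"]  # Input used in task
    str2 = output  # Output from task
    # First row and first column hold the cost of building a prefix from an empty string.
    dp = [
        [i + j if i * j == 0 else 0 for j in range(len(str2) + 1)]
        for i in range(len(str1) + 1)
    ]
    for i in range(1, len(str1) + 1):
        for j in range(1, len(str2) + 1):
            if str1[i - 1] == str2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # Characters match: no extra cost
            else:
                # Cheapest of deletion, insertion, substitution
                dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return dp[-1][-1]


# Illustrative row: only the column the evaluator reads is present.
sample_row = {"attributes.str1": "kitten"}
distance = edit_distance(sample_row, "sitting")
print(distance)  # 3
```

A lower value means the task output is closer to the reference string, with 0 indicating an exact match.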
Evaluator as a Class
You can also run an experiment with an evaluator that inherits from the Evaluator(ABC) base class in the Arize Python SDK. The evaluator takes a single dataset row as input and returns an EvaluationResult dataclass.
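The shape of a class-based evaluator can be sketched as follows. To keep the example runnable without the SDK installed, `Evaluator` and `EvaluationResult` here are local stand-ins, not the SDK's own classes. Their fields (`score`, `label`, `explanation`), the `evaluate` method name, and the column names are assumptions for illustration, so check the SDK for the actual import path and signatures.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvaluationResult:
    # Stand-in for the SDK dataclass; field names are assumed.
    score: Optional[float] = None
    label: Optional[str] = None
    explanation: Optional[str] = None


class Evaluator(ABC):
    # Stand-in for the SDK base class; the method name is assumed.
    @abstractmethod
    def evaluate(self, dataset_row: dict) -> EvaluationResult:
        ...


class ExactMatchEvaluator(Evaluator):
    """Scores 1.0 when two columns of the row hold the same value."""

    def __init__(self, output_column: str, expected_column: str):
        # Both column names are hypothetical and supplied by the caller.
        self.output_column = output_column
        self.expected_column = expected_column

    def evaluate(self, dataset_row: dict) -> EvaluationResult:
        matched = dataset_row[self.output_column] == dataset_row[self.expected_column]
        return EvaluationResult(
            score=1.0 if matched else 0.0,
            label="match" if matched else "mismatch",
            explanation=f"Compared '{self.output_column}' with '{self.expected_column}'.",
        )


evaluator = ExactMatchEvaluator("output", "expected")
result = evaluator.evaluate({"output": "Paris", "expected": "Paris"})
print(result)
```

Compared with a plain function, a class lets the evaluator carry configuration, such as which columns to compare, in its constructor.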