Run an Experiment
Using the run_experiment function
Once you have your dataset, task, and evaluators defined, you can run your experiment against your data to measure and iterate on your LLM outputs.
Here is a quick example:
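A minimal sketch of such a call, assuming `client` is an Arize datasets client object that exposes `run_experiment` with the arguments described on this page. The task and evaluator signatures, the `question` column, and the experiment name are illustrative assumptions, not values taken from the platform.

```python
def answer_question(dataset_row):
    """Task: turn one dataset row into an output (stand-in for an LLM call)."""
    # Assumes each row is dict-like with a hypothetical "question" column.
    return f"Echo: {dataset_row['question']}"


def is_non_empty(output):
    """Evaluator (assumed signature): pass when the task produced some text."""
    return bool(output and output.strip())


def run_quick_example(client, space_id, dataset_id):
    """Run the experiment through a client that exposes run_experiment."""
    return client.run_experiment(
        space_id=space_id,
        dataset_id=dataset_id,
        task=answer_question,
        evaluators=[is_non_empty],
        experiment_name="quick-example",  # hypothetical name
    )
```

The client is passed in as a parameter so the task and evaluator can be exercised locally before any call reaches Arize.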
To run an experiment, you need to specify several required variables:
space_id (str): The ID of the space where the experiment will be run.
experiment_name (str): The name of the experiment.
task (function): A function to be called with defined inputs and outputs (guide).
dataset_id (str): The ID of the dataset in the Arize platform you would like to run this experiment against (guide).
You also have several optional variables:
evaluators (List[function]): The list of evaluator functions to run against the experiment outputs (guide). You can run an experiment without evaluators if you'd like.
dry_run (bool): If True, the experiment result will not be uploaded to Arize. Defaults to False.
concurrency (int): The number of concurrent tasks to run. Defaults to 3.
exit_on_error (bool): If True, the experiment will stop running on the first occurrence of an error.
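The optional arguments can be gathered in one place and passed alongside the required ones. The sketch below assumes an evaluator is a plain function that receives the task output and returns a score; that signature and the helper `build_optional_args` are hypothetical. The `dry_run` and `concurrency` defaults mirror the ones stated above, while the `exit_on_error` default is this sketch's own choice because none is stated here.

```python
def mentions_source(output):
    """Hypothetical evaluator: 1.0 if the output cites a source, else 0.0."""
    return 1.0 if "source:" in output.lower() else 0.0


def build_optional_args(evaluators=None, dry_run=False, concurrency=3,
                        exit_on_error=False):
    """Collect the optional run_experiment arguments into one dict."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    args = {
        "dry_run": dry_run,
        "concurrency": concurrency,
        "exit_on_error": exit_on_error,  # sketch default, not a documented one
    }
    # Evaluators are optional, so only include the key when some are given.
    if evaluators:
        args["evaluators"] = list(evaluators)
    return args
```

The resulting dict can be unpacked into the call, for example `client.run_experiment(space_id=..., dataset_id=..., task=..., experiment_name=..., **build_optional_args(evaluators=[mentions_source], concurrency=5))`.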
You can also fetch datasets from Arize using different input variables. We recommend dataset_id because it's a unique identifier and the simplest to use, but offer these as alternatives for convenience.
dataset_df (pd.DataFrame): A pandas DataFrame object to pass into the task and experiment. This only works with dry_run=True.
dataset_name (str): The name of the dataset to use from the Arize platform.
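One way to keep these alternatives straight is a small helper that picks the dataset argument and enforces the `dry_run` restriction on `dataset_df`. This is a sketch only: `choose_dataset_args` is a hypothetical helper, and the precedence order (ID, then name, then DataFrame) is an assumption based on the recommendation above, not documented behavior.

```python
def choose_dataset_args(dataset_id=None, dataset_name=None, dataset_df=None,
                        dry_run=False):
    """Return the single dataset argument to pass to run_experiment."""
    # Prefer the unique identifier, as recommended.
    if dataset_id is not None:
        return {"dataset_id": dataset_id}
    if dataset_name is not None:
        return {"dataset_name": dataset_name}
    if dataset_df is not None:
        # A local DataFrame is only accepted when nothing is uploaded.
        if not dry_run:
            raise ValueError("dataset_df only works with dry_run=True")
        return {"dataset_df": dataset_df}
    raise ValueError("provide dataset_id, dataset_name, or dataset_df")
```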
A typical call passes the space_id, the dataset_id, the task you would like to run, and the list of evaluators defined on the output. Running the experiment also logs the traces to Arize so you can debug each run.
You can specify dry_run=True, which does not log the result to Arize. You can also specify exit_on_error=True, which makes it easier to debug when an experiment doesn't run correctly.
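These two flags combine naturally into a rehearsal step before the logged run. The following is a sketch under the assumption that `client` exposes `run_experiment` as described on this page; `run_with_rehearsal` is a hypothetical helper, not part of the Arize SDK.

```python
def run_with_rehearsal(client, **experiment_args):
    """Dry-run first, then do the logged run only if the rehearsal succeeds."""
    # Drop any flags the caller passed so they cannot clash with ours.
    base = {k: v for k, v in experiment_args.items()
            if k not in ("dry_run", "exit_on_error")}
    # Rehearsal: nothing is uploaded, and the first error stops the run.
    client.run_experiment(**base, dry_run=True, exit_on_error=True)
    # Real run: reached only if the rehearsal did not raise.
    return client.run_experiment(**base, dry_run=False)
```

If the rehearsal raises, the exception propagates and the logged run never starts, so a broken task or evaluator does not produce a partial experiment in Arize.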
Navigate to the dataset in the UI and see the experiment output table.