Use Phoenix Evaluators
The following are simple functions built on top of the LLM Evals building blocks and pre-tested against benchmark data.
All eval templates are tested against golden datasets that ship as part of the LLM Evals library's benchmark data, targeting precision of 70-90% and F1 of 70-85%.
Models are instantiated once and passed into the LLM Eval functions; they can also be called directly with a string. A short usage sketch follows the table below.
We currently support a growing set of models for LLM Evals; please check the API section for usage details.
| Model | Support |
| --- | --- |
| GPT-4 | ✔ |
| GPT-3.5 Turbo | ✔ |
| GPT-3.5 Instruct | ✔ |
| Azure Hosted Open AI | ✔ |
| Palm 2 Vertex | ✔ |
| AWS Bedrock | ✔ |
| Litellm | ✔ |
| Huggingface Llama7B | (use litellm) |
| Anthropic | ✔ |
| Cohere | (use litellm) |
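As a minimal sketch of how a model is instantiated and then reused, both inside an eval function and as a direct string call: this assumes the `OpenAIModel` wrapper, the `llm_classify` helper, and the hallucination template exports from `phoenix.evals`; exact module paths, parameter names, and expected column names may differ between Phoenix versions.

```python
import pandas as pd
from phoenix.evals import (
    HALLUCINATION_PROMPT_RAILS_MAP,
    HALLUCINATION_PROMPT_TEMPLATE,
    OpenAIModel,
    llm_classify,
)

# Instantiate the eval model once; the same object is reused across eval calls.
model = OpenAIModel(model="gpt-4", temperature=0.0)

# The model is also directly callable with a string.
print(model("Is the sky blue on a clear day? Answer yes or no."))

# Example rows; the hallucination template reads input/reference/output columns.
df = pd.DataFrame(
    {
        "input": ["What is Phoenix?"],
        "reference": ["Phoenix is an open-source LLM observability library."],
        "output": ["Phoenix is a closed-source billing system."],
    }
)

# Pass the instantiated model to the LLM Eval function.
results = llm_classify(
    dataframe=df,
    model=model,
    template=HALLUCINATION_PROMPT_TEMPLATE,
    rails=list(HALLUCINATION_PROMPT_RAILS_MAP.values()),
)
print(results["label"])
```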
The diagram above shows examples of the different environments the Eval harness is designed to run in. The benchmarking environment enables testing of Eval model and Eval template performance against a curated golden dataset.
The above approach allows us to compare models easily in an understandable format:
| Metric | Eval Model A | Eval Model B |
| --- | --- | --- |
| Precision | 0.94 | 0.94 |
| Recall | 0.75 | 0.71 |
| F1 | 0.83 | 0.81 |
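To reproduce this kind of comparison, the eval's predicted labels on a golden dataset can be scored against the human annotations with standard metrics. The sketch below uses scikit-learn; the `golden_df` columns (`true_label`, `label`) and the `"hallucinated"` positive label are illustrative assumptions, not fixed Phoenix column names.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

def benchmark_eval(golden_df, positive_label="hallucinated"):
    # golden_df: a benchmark dataset with a human-annotated "true_label" column
    # and a "label" column produced by the eval (e.g. the llm_classify output).
    y_true = golden_df["true_label"]
    y_pred = golden_df["label"]
    return {
        "precision": precision_score(y_true, y_pred, pos_label=positive_label),
        "recall": recall_score(y_true, y_pred, pos_label=positive_label),
        "f1": f1_score(y_true, y_pred, pos_label=positive_label),
    }
```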
The pre-tested evals and the benchmark datasets they were validated on:

| Eval | Tested on |
| --- | --- |
| Retrieval Eval | MS Marco, WikiQA |
| Hallucination Eval | Hallucination QA Dataset, Hallucination RAG Dataset |
| Toxicity Eval | WikiToxic |
| Q&A Eval | WikiQA |
| Summarization Eval | GigaWorld, CNNDM, Xsum |
| Code Generation Eval | WikiSQL, HumanEval, CodeXGlu |
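Each pre-tested eval is exposed as a prompt template plus output rails that plug into the same flow shown earlier. For example, a toxicity check might look like the sketch below, which assumes `TOXICITY_PROMPT_TEMPLATE` and `TOXICITY_PROMPT_RAILS_MAP` exports and an `input` column; exact names can vary between Phoenix releases.

```python
import pandas as pd
from phoenix.evals import (
    TOXICITY_PROMPT_RAILS_MAP,
    TOXICITY_PROMPT_TEMPLATE,
    OpenAIModel,
    llm_classify,
)

model = OpenAIModel(model="gpt-4", temperature=0.0)

# The toxicity template reads the text under test from an "input" column.
df = pd.DataFrame({"input": ["You are a wonderful collaborator.", "I hate you."]})

toxicity_results = llm_classify(
    dataframe=df,
    model=model,
    template=TOXICITY_PROMPT_TEMPLATE,
    rails=list(TOXICITY_PROMPT_RAILS_MAP.values()),
)
print(toxicity_results["label"])  # one rail label per row
```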