Agent Parameter Extraction

This article covers how to evaluate whether a model extracts the right parameters from a user query when making a tool call. Agents can go awry when a tool call uses the wrong parameters and returns irrelevant results.

Use this template when you want to grade the parameter extraction itself, keeping tool selection as its own separate evaluation. Smaller evaluation tasks are generally more accurate, but splitting them up increases the number of evals you have to run and account for.

Prompt Template

You are comparing a function call response to a question and trying to determine if the generated call has extracted exactly the right parameters from the question. Here is the data:
    [BEGIN DATA]
    ************
    [Question]: {question}
    ************
    [LLM Response]: {response}
    ************
    [END DATA]

Compare the parameters in the generated function against the JSON provided below.
The parameters extracted from the question must match the JSON below exactly.
Your response must be a single word, either "correct", "incorrect", or "not-applicable",
and should not contain any text or characters aside from that word.

"correct" means the function call parameters match the JSON below and provides only relevant information.
"incorrect" means that the parameters in the function do not match the JSON schema below exactly, or the generated function does not correctly answer the user's question. You should also respond with "incorrect" if the response makes up information that is not in the JSON schema.
"not-applicable" means that response was not a function call.

Here is more information on each function:
{function_definitions}
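
As a concrete illustration, here is a minimal sketch of wiring this template into an LLM-as-judge call. It assumes the OpenAI Python client (v1+) as the judge; the model name, the `grade_parameter_extraction` helper, and the example inputs are all illustrative assumptions, not part of any particular eval framework.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def grade_parameter_extraction(template: str, question: str,
                               response: str, function_definitions: str) -> str:
    """Fill the prompt template above and ask a judge model for a one-word label."""
    prompt = template.format(
        question=question,
        response=response,
        function_definitions=function_definitions,
    )
    completion = client.chat.completions.create(
        model="gpt-4o",   # placeholder; any capable judge model works
        temperature=0,    # deterministic grading
        messages=[{"role": "user", "content": prompt}],
    )
    label = (completion.choices[0].message.content or "").strip().lower()
    # Guard against the judge drifting outside the allowed vocabulary.
    if label not in {"correct", "incorrect", "not-applicable"}:
        raise ValueError(f"Unexpected judge output: {label!r}")
    return label
```

A hypothetical usage, where `PROMPT_TEMPLATE` holds the template text shown above and the expected-parameter JSON is appended under the function definitions:

```python
label = grade_parameter_extraction(
    template=PROMPT_TEMPLATE,
    question="What is the weather in Paris tomorrow?",
    response='get_weather(location="Paris", date="tomorrow")',
    function_definitions='{"get_weather": {"location": "string", "date": "string"}}',
)
print(label)  # "correct", "incorrect", or "not-applicable"
```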
