Add prompt templates and variables to your dataset

Datasets are critical to evaluating and benchmarking your LLM apps across different use cases and test scenarios. Using experiments, you can run a prompt template against every example in a dataset and compare the outputs. There are two places you can attach prompt templates and variables:

During instrumentation
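At instrumentation time, the same information is recorded as span attributes on the LLM call. Below is a minimal sketch of the payload shape defined by the OpenInference semantic conventions; the helper function is hypothetical, and the keys shown here are the un-prefixed span attribute names (when spans are flattened for export, each key gains the "attributes." prefix used in the dataset example below):

```python
import json

# OpenInference semantic-convention keys for prompt templates.
LLM_PROMPT_TEMPLATE = "llm.prompt_template.template"
LLM_PROMPT_TEMPLATE_VARIABLES = "llm.prompt_template.variables"


def prompt_template_attributes(template: str, variables: dict) -> dict:
    """Hypothetical helper: build the span attributes to set when
    instrumenting an LLM call. Note that the variables must be a
    JSON-encoded string, not a raw dict."""
    return {
        LLM_PROMPT_TEMPLATE: template,
        LLM_PROMPT_TEMPLATE_VARIABLES: json.dumps(variables),
    }


attrs = prompt_template_attributes(
    "Persona: {persona}\nProblem: {problem}",
    {"persona": "An aspiring musician", "problem": "Overthinking lyrics"},
)
# Each key/value pair would then be passed to span.set_attribute(...)
# on the active span.
```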

During dataset creation

In this dataset, we set attributes.llm.prompt_template.variables to a dictionary serialized as a JSON string. Conforming to the OpenInference semantic conventions here allows you to use these attributes in the prompt playground, where they will be correctly imported as input variables.

import json

import pandas as pd

# Client setup is assumed here: create an ArizeDatasetsClient with your
# Arize API key before calling create_dataset.
from arize.experimental.datasets import ArizeDatasetsClient
from arize.experimental.datasets.utils.constants import GENERATIVE

client = ArizeDatasetsClient(api_key="YOUR_API_KEY")

PROMPT_TEMPLATE = """
You are an expert product manager recommending features for a target user. 

Persona: {persona}
Problem: {problem}
"""

data = [
    {
        "attributes.llm.prompt_template.template": PROMPT_TEMPLATE,
        "attributes.llm.prompt_template.variables": json.dumps({
            "persona": "An aspiring musician who is writing their own songs",
            "problem": "I often get stuck overthinking my lyrics and melodies.",
        })
    },
    {
        "attributes.llm.prompt_template.template": PROMPT_TEMPLATE,
        "attributes.llm.prompt_template.variables": json.dumps({
            "persona": "A Christian who goes to church every week",
            "problem": "I'm often too tired for deep Bible study at the end of the day.",
        })
    },
]

df = pd.DataFrame(data)

dataset_id = client.create_dataset(
    space_id="YOUR_SPACE_ID", 
    dataset_name="Your Dataset",
    dataset_type=GENERATIVE,
    data=df
)

Here's how it looks when you import the dataset into the prompt playground, making it easy to iterate on your prompt and test new outputs across many data points.
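The import works because the variables field round-trips through JSON: the playground can decode the string back into a dict and substitute each key into the template. A rough sketch of that substitution (illustrative only, not the playground's actual code):

```python
import json

PROMPT_TEMPLATE = """
You are an expert product manager recommending features for a target user.

Persona: {persona}
Problem: {problem}
"""

# One dataset row, shaped like the example above.
row = {
    "attributes.llm.prompt_template.template": PROMPT_TEMPLATE,
    "attributes.llm.prompt_template.variables": json.dumps(
        {"persona": "An aspiring musician", "problem": "Overthinking lyrics"}
    ),
}

# Decode the JSON string back into a dict and fill the template.
variables = json.loads(row["attributes.llm.prompt_template.variables"])
rendered = row["attributes.llm.prompt_template.template"].format(**variables)
```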


Copyright © 2023 Arize AI, Inc