Create a synthetic dataset using LLMs

When you first start developing with LLMs, you typically have a prompt and little else. Early iteration gets you to the point where the demo video looks amazing, but you still lack confidence in the application's reliability and robustness.

This is where LLMs can generate examples for you based on your prompt. For example, you can give ChatGPT or your LLM of choice the following prompt to create a set of examples that you can upload to Arize:

You are a data analyst. You are using LLMs to summarize a document. Create a CSV of 20 test cases with the following columns:

1. Input: The full document text, usually five paragraphs of articles about beauty products.
2. Prompt Variables: A JSON string of metadata attached to the article, such as the article title, date, and website URL
3. Output: The one line summary

This generates a CSV file with one row per test case and the three columns listed above.

Coming soon, you'll be able to do this directly in the Arize platform based on your traces and prompts, but in the interim, you can upload this data with code.
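
LLM-generated CSVs are not always well formed, so it can help to check the file before uploading it. The following is a minimal sketch, not an Arize API: it assumes the header row uses exactly the column names `Input`, `Prompt Variables`, and `Output` from the prompt above, and the helper name `load_test_cases` and the placeholder row are hypothetical. It uses only the Python standard library.

```python
import csv
import io
import json

# Column names taken from the prompt above; the exact header spelling
# produced by your LLM is an assumption and may need adjusting.
EXPECTED_COLUMNS = ["Input", "Prompt Variables", "Output"]


def load_test_cases(csv_text):
    """Parse LLM-generated CSV text and return a list of validated test cases.

    Hypothetical helper: raises ValueError if a column is missing, a field
    is empty, or Prompt Variables is not a JSON object.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    cases = []
    for index, row in enumerate(reader, start=1):
        # DictReader fills short rows with None, so guard before stripping.
        for name in ("Input", "Output"):
            if not (row[name] or "").strip():
                raise ValueError(f"row {index}: {name} is empty")
        try:
            variables = json.loads(row["Prompt Variables"])
        except (TypeError, json.JSONDecodeError) as exc:
            raise ValueError(
                f"row {index}: Prompt Variables is not valid JSON"
            ) from exc
        if not isinstance(variables, dict):
            raise ValueError(f"row {index}: Prompt Variables must be a JSON object")
        cases.append(
            {
                "input": row["Input"],
                "prompt_variables": variables,
                "output": row["Output"],
            }
        )
    return cases


if __name__ == "__main__":
    # Build a one-row placeholder CSV so the JSON field is quoted correctly.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(EXPECTED_COLUMNS)
    writer.writerow(
        [
            "Placeholder article text.",
            json.dumps({"title": "Placeholder title"}),
            "Placeholder summary.",
        ]
    )
    cases = load_test_cases(buffer.getvalue())
    print(len(cases), cases[0]["prompt_variables"]["title"])
```

Parsing `Prompt Variables` up front catches the most common failure, a JSON string whose quotes were not escaped for CSV, before the data reaches an upload step.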

Copyright © 2023 Arize AI, Inc