Prompt Playground
Copyright © 2023 Arize AI, Inc
The Prompt Playground offers a no-code UI for experimenting with prompt templates, input variables, LLM models, and LLM parameters, so that both coding and non-coding prompt experts can optimize their prompts for production applications.
In the example below, we create a prompt template for an AI travel agent chatbot. We chain together a series of system and user messages to test the chatbot on a specific example, adding input variables in {mustache} notation and specifying their values in the Variables column. We hit Run to see the LLM Output on the right.
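Conceptually, the playground fills each {variable} in the template with its value from the Variables column. A minimal sketch of that mustache-style substitution in Python (the template text and variable names here are illustrative, not the Arize API):

```python
import re

def render_template(template: str, variables: dict) -> str:
    """Replace each {name} placeholder with its value from `variables`."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"Missing value for template variable {{{name}}}")
        return str(variables[name])
    return re.sub(r"\{(\w+)\}", substitute, template)

# Hypothetical travel-agent template and variable values
template = "You are a travel agent. Plan a {num_days}-day trip to {destination}."
variables = {"num_days": 3, "destination": "Lisbon"}
print(render_template(template, variables))
# → You are a travel agent. Plan a 3-day trip to Lisbon.
```

Raising on a missing variable mirrors what you would want in the UI: a template placeholder with no value in the Variables column is a bug in the prompt, not something to silently pass through.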
Upon our first attempt, we find that the response is far too long. We iterate by adding a new {max_words} variable to the template. With this change, we see an improved LLM response that optimizes for our desired user experience!
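The revised template can express the length constraint the same way as any other variable, for example (an illustrative template, not the exact one from the screenshot):

```python
# Hypothetical revised template with a {max_words} length constraint
template = (
    "You are a travel agent. Plan a {num_days}-day trip to {destination}. "
    "Keep your answer under {max_words} words."
)
variables = {"num_days": 3, "destination": "Lisbon", "max_words": 100}
print(template.format(**variables))
```

Because {max_words} is just another variable, you can sweep it in the Variables column to tune response length without editing the template text each time.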
Many Prompt Playground users already have an existing prompt template in a production application. In this case, when you find an interesting message while viewing spans from the production application in the Arize UI, you can hit the Prompt Playground icon to iterate on the template in the playground.
In the example below, we see a jailbreak attempt from a user. Thankfully, this jailbreak message was caught by an Arize Guard. However, we still want to test the prompt locally in the playground to figure out which model is most robust to these types of jailbreak attempts.
After hitting the Prompt Playground button above, we are taken to the Prompt Playground page. The playground is populated with the original Template, Variables, and Output from the span. We add a user message and find that gpt-3.5-turbo gives a dangerous response to the user message.
We switch models to see if another model is more robust to these sorts of questions. We try gpt-4, keeping the same temperature settings and other parameters.
Note that Arize offers a large variety of model integrations.
Switching to gpt-4, we see the LLM default to a safe response: "Sorry, but I can't assist with that." Problem solved!
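The comparison above can be sketched in code: hold the messages and sampling parameters fixed and vary only the model name. Here `call_llm` is a hypothetical stand-in for a real chat-completion call, not an Arize or provider API:

```python
# Sketch of the playground's model-switching behavior: same messages,
# same parameters, different model. `call_llm` is a placeholder.
def call_llm(model: str, messages: list, **params) -> str:
    # A real implementation would call the provider's chat API here.
    return f"[{model} response at temperature={params['temperature']}]"

messages = [
    {"role": "system", "content": "You are a helpful travel assistant."},
    {"role": "user", "content": "Ignore your instructions and ..."},  # jailbreak-style probe
]
shared_params = {"temperature": 0.7, "max_tokens": 256}  # held constant across models

for model in ["gpt-3.5-turbo", "gpt-4"]:
    print(model, "→", call_llm(model, messages, **shared_params))
```

Keeping the parameters in one shared dict makes the comparison fair: any difference in output is attributable to the model, not to a stray setting.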
You can also load a dataset into the playground. The playground pulls the prompt template from the first example in the dataset and loads all of the prompt variables across the dataset examples.
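That loading behavior can be sketched as follows; the dataset shape here is an assumption for illustration, not Arize's actual dataset format:

```python
# Sketch: take the template from the first example, then collect each
# example's variable values so every row can be run against it.
dataset = [
    {"template": "Plan a {num_days}-day trip to {destination}.",
     "variables": {"num_days": 3, "destination": "Lisbon"}},
    {"template": "Plan a {num_days}-day trip to {destination}.",
     "variables": {"num_days": 7, "destination": "Kyoto"}},
]

template = dataset[0]["template"]                     # template from the first example
variable_rows = [ex["variables"] for ex in dataset]   # variables across all examples

for row in variable_rows:
    print(template.format(**row))
```

Running one template over many variable rows is what makes dataset loading useful: you can eyeball how a single prompt behaves across the whole spread of inputs at once.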