Quickstart: Prompts (UI)


Getting Started

Prompt playground can be accessed from the left navbar of Phoenix.

From here, you can prompt your model directly by modifying either the system or user prompt and pressing the Run button in the top right.

Basic Example Use Case

Let's start by comparing a few different prompt variations. Add two additional prompts using the +Prompt button, and update the system and user prompts like so:

System prompt #1:

You are a summarization tool. Summarize the provided paragraph.

System prompt #2:

You are a summarization tool. Summarize the provided paragraph in 2 sentences or less.

System prompt #3:

You are a summarization tool. Summarize the provided paragraph. Make sure not to leave out any key points.

User prompt (use this for all three):

In software engineering, more specifically in distributed computing, observability is the ability to collect data about programs' execution, modules' internal states, and the communication among components.[1][2] To improve observability, software engineers use a wide range of logging and tracing techniques to gather telemetry information, and tools to analyze and use it. Observability is foundational to site reliability engineering, as it is the first step in triaging a service outage. One of the goals of observability is to minimize the amount of prior knowledge needed to debug an issue.

Your playground should look something like this:

Let's run it and compare results:

Creating a Prompt

It looks like the second option is producing the most concise summary. Go ahead and save that prompt to your Prompt Hub.

Your prompt will be saved in the Prompts tab:

Now you're ready to see how that prompt performs over a larger dataset of examples.

Running over a dataset

Prompt Playground can be used to run a series of dataset rows through your prompts. To start off, we'll need a dataset. Phoenix has many options to upload a dataset; to keep things simple here, we'll directly upload a CSV. Download the article summaries file linked below:

news-article-summaries-2024-11-04 11_08_10.csv (51KB)

Next, create a new dataset from the Datasets tab in Phoenix, and specify the input and output columns like so:
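
If you'd rather create the dataset programmatically instead of through the UI, the Phoenix Python client can upload a DataFrame directly. This is a minimal sketch, assuming Phoenix is running locally and that the CSV's columns are named `input_article` and `summary` (adjust these to match the actual file):

```python
# Sketch: upload the downloaded CSV as a Phoenix dataset from Python.
# The column names below are assumptions -- rename them to match your CSV.
import pandas as pd
import phoenix as px

df = pd.read_csv("news-article-summaries-2024-11-04 11_08_10.csv")

dataset = px.Client().upload_dataset(
    dataset_name="news-article-summaries",
    dataframe=df,
    input_keys=["input_article"],  # columns passed to the prompt as inputs
    output_keys=["summary"],       # columns treated as reference outputs
)
```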

Now we can return to Prompt Playground, and this time choose our new dataset from the "Test over dataset" dropdown.

You can also load in your saved Prompt:

We'll also need to update our prompt to look for the {{input_article}} column in our dataset. After adding this in, be sure to save your prompt once more!

Now if we run our prompt(s), each row of the dataset will be run through each variation of our prompt.

And if you return to view your dataset, you'll see the details of that run saved as an experiment.

From here, you could evaluate that experiment to test its performance, or add complexity to your prompts by including different tools, output schemas, and models to test against.
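
If you later want to reproduce this kind of run in code rather than in the Playground, Phoenix's experiments helpers can execute a task over the same dataset. A rough sketch, assuming the `dataset` object uploaded above, an OpenAI API key, and an illustrative summarization task (the model name and prompt text here are only examples):

```python
# Sketch: run each dataset row through a summarization prompt as an experiment.
# Assumes `dataset` is the object returned by upload_dataset above.
from openai import OpenAI
from phoenix.experiments import run_experiment

openai_client = OpenAI()

def summarize(input):
    # `input` holds the example's input columns, e.g. {"input_article": "..."}
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",  # example model
        messages=[
            {
                "role": "system",
                "content": "You are a summarization tool. Summarize the provided paragraph in 2 sentences or less.",
            },
            {"role": "user", "content": input["input_article"]},
        ],
    )
    return response.choices[0].message.content

experiment = run_experiment(dataset, summarize, experiment_name="summaries-2-sentences")
```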

Updating a Prompt

You can now easily modify your prompt or compare different versions side-by-side. Let's say you've found a stronger version of the prompt. Save your updated prompt once again, and you'll see it added as a new version under your existing prompt:

You can also tag the version you've deemed ready for production, and view code snippets for accessing that prompt in code further down the page.
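
As a rough illustration of what that generated code looks like, a saved prompt can be pulled and used with a provider SDK via the `phoenix-client` package. The prompt identifier below is a placeholder, and the exact client API may differ slightly between versions:

```python
# Sketch: fetch a saved prompt and call OpenAI with it.
# "article-summarizer" is a placeholder for your prompt's identifier.
from openai import OpenAI
from phoenix.client import Client

prompt = Client().prompts.get(prompt_identifier="article-summarizer")

# format() fills in prompt variables and returns keyword arguments
# for the provider SDK call.
formatted = prompt.format(variables={"input_article": "Observability is ..."})
response = OpenAI().chat.completions.create(**formatted)
print(response.choices[0].message.content)
```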

Next Steps

Now you're ready to create, test, save, and iterate on your Prompts in Phoenix! Check out our other quickstarts to see how to use Prompts in code.
