Creating Datasets

From CSV

When manually creating a dataset (say, collecting hypothetical questions and answers), the easiest way to start is with a spreadsheet. Once you've collected the information, you can upload the CSV of your data to the Phoenix platform using the UI. You can also programmatically upload tabular data using Pandas, as seen below.
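If your data is already in a CSV file, a minimal sketch of the programmatic path is to read the file into a DataFrame with pandas and pass it to the same upload_dataset call shown in the next section. The filename and column names here are assumptions for illustration:

import pandas as pd
import phoenix as px

# Read the spreadsheet you collected; the filename and column names are hypothetical.
dataset_df = pd.read_csv("physics-questions.csv")

# Upload the tabular data to a running Phoenix instance.
client = px.Client()
dataset = client.upload_dataset(
    dataframe=dataset_df,
    dataset_name="physics-questions-from-csv",
    input_keys=["query"],       # columns treated as example inputs
    output_keys=["responses"],  # columns treated as example outputs
)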

From Pandas

import pandas as pd
import phoenix as px

queries = [
    "What are the 9 planets in the solar system?",
    "How many generations of fundamental particles have we observed?",
    "Is Aluminum a superconductor?",
]
responses = [
    "There are 8 planets in the solar system.",
    "We have observed 3 generations of fundamental particles.",
    "Yes, Aluminum becomes a superconductor at 1.2 degrees Kelvin.",
]

dataset_df = pd.DataFrame(data={"query": queries, "responses": responses})

# Launch a local Phoenix instance (skip this if Phoenix is already running elsewhere)
px.launch_app()
client = px.Client()
dataset = client.upload_dataset(
    dataframe=dataset_df,
    dataset_name="physics-questions",
    input_keys=["query"],       # DataFrame columns treated as example inputs
    output_keys=["responses"],  # DataFrame columns treated as example outputs
)

From Objects

Sometimes you just want to upload datasets using plain objects, since CSVs and DataFrames can be too restrictive about the keys.


ds = px.Client().upload_dataset(
    dataset_name="my-synthetic-dataset",
    inputs=[{"question": "hello"}, {"question": "good morning"}],
    outputs=[{"answer": "hi"}, {"answer": "good morning"}],
)

Synthetic Data

One of the quickest ways to get started is to produce synthetic queries using an LLM.

One use case for synthetic data creation is when you want to test your RAG pipeline. You can leverage an LLM to synthesize hypothetical questions about your knowledge base.

In the example below we will use Phoenix's built-in llm_generate, but you can use any synthetic dataset creation tool you'd like.

Before running this example, ensure you've set your OPENAI_API_KEY environment variable.

Imagine you have a knowledge base that contains the following documents:

import pandas as pd

document_chunks = [
  "Paul Graham is a VC",
  "Paul Graham loves lisp",
  "Paul founded YC",
]
document_chunks_df = pd.DataFrame({"text": document_chunks})
generate_questions_template = (
    "Context information is below.\n\n"
    "---------------------\n"
    "{text}\n"
    "---------------------\n\n"
    "Given the context information and not prior knowledge.\n"
    "generate only questions based on the below query.\n\n"
    "You are a Teacher/ Professor. Your task is to setup "
    "one question for an upcoming "
    "quiz/examination. The questions should be diverse in nature "
    "across the document. Restrict the questions to the "
    "context information provided.\n\n"
    "Output the questions in JSON format with the key question"
)

Once created, this synthetic data can be uploaded to Phoenix for later reuse. Below, llm_generate applies the template to each document chunk and parses each model response as JSON.

import json

from phoenix.evals import OpenAIModel, llm_generate


def output_parser(response: str, index: int):
    try:
        return json.loads(response)
    except json.JSONDecodeError as e:
        return {"__error__": str(e)}


questions_df = llm_generate(
    dataframe=document_chunks_df,
    template=generate_questions_template,
    model=OpenAIModel(model="gpt-3.5-turbo"),
    output_parser=output_parser,
    concurrency=20,
)
# The synthetic questions have no ground-truth answers yet, so leave the output column empty
questions_df["output"] = [None, None, None]

Once we've constructed a collection of synthetic questions, we can upload them to a Phoenix dataset.

import phoenix as px

# Note that the below code assumes that phoenix is running and accessible
client = px.Client()
client.upload_dataset(
    dataframe=questions_df,
    dataset_name="paul-graham-questions",
    input_keys=["question"],
    output_keys=["output"],
)

From Spans

If you have an application that is traced using instrumentation, you can quickly add any span or group of spans using the Phoenix UI.

To add a single span to a dataset, select the span in the trace details view. You should see an "Add to Dataset" button in the top right. From there you can select the dataset you would like to add it to and make any changes you need before saving the example.

You can also use the filters on the spans table and select multiple spans to add to a specific dataset.

(Screenshots: adding a specific span as a golden dataset or an example for further testing, and adding LLM spans to a dataset for fine-tuning.)
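If you prefer to build a dataset from spans programmatically, a minimal sketch (assuming Phoenix is running and your application is instrumented) is to export spans as a DataFrame and upload the columns you care about. The filter condition and the attributes.input.value / attributes.output.value column names follow Phoenix's span query conventions, but the exact columns may differ depending on your instrumentation:

import phoenix as px

client = px.Client()

# Export LLM spans as a DataFrame (the filter condition here is just an example).
spans_df = client.get_spans_dataframe("span_kind == 'LLM'")

# Keep the input/output attribute columns and rename them for the dataset.
examples_df = spans_df[["attributes.input.value", "attributes.output.value"]].rename(
    columns={
        "attributes.input.value": "input",
        "attributes.output.value": "output",
    }
)

client.upload_dataset(
    dataframe=examples_df,
    dataset_name="llm-spans-dataset",
    input_keys=["input"],
    output_keys=["output"],
)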