Create a dataset with code

If you'd like to create your datasets programmatically, you can use our Python SDK to create, update, and delete datasets.

To start, let's install the packages we need:

!pip install "arize[Datasets]" pandas

Let's get your developer key by clicking "code" on the datasets page.

Let's set up the Arize Datasets client to create or update a dataset.

from arize.experimental.datasets import ArizeDatasetsClient
import pandas as pd

# Paste the developer key you copied from the datasets page
developer_key = "YOUR_DEVELOPER_KEY"
client = ArizeDatasetsClient(developer_key=developer_key)
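
The client above takes a `developer_key`. If you would rather not paste the key directly into a notebook, one option is to read it from an environment variable. The following is a minimal sketch under that assumption; the variable name `ARIZE_DEVELOPER_KEY` and the helper `load_developer_key` are illustrative and are not part of the Arize SDK.

```python
import os


def load_developer_key(env_var: str = "ARIZE_DEVELOPER_KEY") -> str:
    """Return the developer key stored in the given environment variable.

    The variable name is an assumption for illustration; use whatever
    name you exported the key under.
    """
    # Treat a missing or whitespace-only value as "not set"
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your developer key."
        )
    return key
```

With the variable exported, `developer_key = load_developer_key()` can be passed to `ArizeDatasetsClient(developer_key=developer_key)` exactly as shown above.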

You can create many different kinds of datasets. The examples below are sorted by complexity.

If you are looking to upload a standard set of examples with string inputs, you can create the dataframe as follows.

import pandas as pd
from arize.experimental.datasets.utils.constants import GENERATIVE

# Each dict is one example; each key becomes a column of string inputs
data = [{
    "persona": "An aspiring musician who is writing their own songs",
    "problem": "I often get stuck overthinking my lyrics and melodies.",
}]

df = pd.DataFrame(data)

# Upload the dataframe as a new dataset and keep the returned dataset ID
dataset_id = client.create_dataset(
    space_id="YOUR_SPACE_ID",
    dataset_name="Your Dataset",
    dataset_type=GENERATIVE,
    data=df,
)
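
When you assemble more than a handful of examples by hand, it can help to check them before building the dataframe. The sketch below is one possible pre-upload check, not something the SDK requires: it assumes every example should be a dict of strings with the same keys. The helper `check_examples` and the second example row are hypothetical and exist only for illustration.

```python
def check_examples(examples):
    """Raise ValueError unless every example is a dict of strings sharing the same keys."""
    if not examples:
        raise ValueError("Provide at least one example.")
    # The first example defines the expected set of columns
    expected_keys = set(examples[0])
    for i, example in enumerate(examples):
        if set(example) != expected_keys:
            raise ValueError(
                f"Example {i} has keys {sorted(example)}, "
                f"expected {sorted(expected_keys)}."
            )
        for key, value in example.items():
            if not isinstance(value, str):
                raise ValueError(f"Example {i}: value for {key!r} is not a string.")
    return examples


data = [
    {
        "persona": "An aspiring musician who is writing their own songs",
        "problem": "I often get stuck overthinking my lyrics and melodies.",
    },
    {
        # Hypothetical second row, added only to show a multi-example dataset
        "persona": "A hypothetical second persona",
        "problem": "A hypothetical second problem statement.",
    },
]

checked = check_examples(data)
```

The checked list can then be passed to `pd.DataFrame(checked)` and uploaded with `client.create_dataset` in the same way as the single-example dataset above.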

Copyright © 2023 Arize AI, Inc