
LLM Token and Usage Tracking

AI engineers and platform builders are often responsible for the LLMs deployed within an enterprise. To track how these LLMs are used over time, they need dashboards that visualize the core attributes of the LLM systems and applications.
An example LLM dashboard view in Arize
Arize tracks these core attributes through fields in the Arize schema that designate LLM token usage and latency. Those fields are shown below with the corresponding log call:
from arize.pandas.logger import Client, Schema
from arize.utils.types import ModelTypes, Environments, EmbeddingColumnNames, LLMRunMetadataColumnNames
API_KEY = 'ARIZE_API_KEY'
SPACE_KEY = 'YOUR SPACE KEY'
arize_client = Client(space_key=SPACE_KEY, api_key=API_KEY)
# Prompt or response can be (optionally) sent as embeddings or text
# Declare prompt as embedding
prompt_columns = EmbeddingColumnNames(
    vector_column_name="document_vector",
    data_column_name="prompt_raw_text",
)
# Declare response as text
response_column="response_raw_text"
# Define the Schema
schema = Schema(
    prediction_id_column_name="prediction_id",
    timestamp_column_name="prediction_ts",
    tag_column_names=["sacreBLEU", "rouge1", "rouge2", "rougeL", "rougeLsum"],
    prompt_column_names=prompt_columns,
    response_column_names=response_column,
    # New fields below
    llm_run_metadata_column_names=LLMRunMetadataColumnNames(
        total_token_count_column_name="total_token_count",
        prompt_token_count_column_name="prompt_token_count",
        response_token_count_column_name="response_token_count",
        response_latency_ms_column_name="response_latency_ms",
    ),
)
# Log the dataframe with the schema mapping
response = arize_client.log(
    model_id="demo-generative-ai-text-summarization",
    model_version="v1",
    model_type=ModelTypes.GENERATIVE_LLM,
    environment=Environments.PRODUCTION,
    dataframe=test_dataframe,
    schema=schema,
)
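The dataframe passed to the log call needs one column for each name the schema maps. The following is a minimal sketch of how such records might be assembled, not Arize's own method: `build_row`, `count_tokens`, and `fake_llm` are hypothetical helpers, and whitespace splitting is only a stand-in for the model's real tokenizer.

```python
import time
from typing import Callable, Dict, List


def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting stands in for the model's real tokenizer.
    return len(text.split())


def build_row(prediction_id: str, prompt: str, llm: Callable[[str], str]) -> Dict[str, object]:
    """Call the LLM once and return one record with the run-metadata columns."""
    start = time.perf_counter()
    response = llm(prompt)
    latency_ms = (time.perf_counter() - start) * 1000.0

    prompt_tokens = count_tokens(prompt)
    response_tokens = count_tokens(response)
    return {
        "prediction_id": prediction_id,
        "prediction_ts": int(time.time()),
        "prompt_raw_text": prompt,
        "response_raw_text": response,
        "prompt_token_count": prompt_tokens,
        "response_token_count": response_tokens,
        "total_token_count": prompt_tokens + response_tokens,
        "response_latency_ms": latency_ms,
    }


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    return "A short summary."


rows: List[Dict[str, object]] = [
    build_row("id-1", "Summarize this article about token usage.", fake_llm),
]
```

Passing `rows` to `pandas.DataFrame(rows)` gives a dataframe with these column names. The embedding vector and evaluation tag columns named in the schema are omitted here for brevity.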
Defining these fields lets you create a default LLM dashboard from the Dashboard view. Learn how to ingest other LLM-related information into Arize here.
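Because the dashboard depends on these four columns, it can help to check records before logging them. The sketch below is illustrative only: `find_problems` is a hypothetical helper, and the rule that the total equals prompt plus response tokens is an assumption that may not hold for every provider's accounting.

```python
from typing import Dict, List

# Column names taken from the schema mapping in this section.
REQUIRED_COLUMNS = (
    "total_token_count",
    "prompt_token_count",
    "response_token_count",
    "response_latency_ms",
)


def find_problems(row: Dict[str, object]) -> List[str]:
    """Return human-readable problems found in one record (empty if none)."""
    problems = []
    for name in REQUIRED_COLUMNS:
        value = row.get(name)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} is missing or not numeric")
        elif value < 0:
            problems.append(f"{name} is negative")
    if not problems:
        # Assumption: total tokens = prompt tokens + response tokens.
        expected = row["prompt_token_count"] + row["response_token_count"]
        if row["total_token_count"] != expected:
            problems.append("total_token_count != prompt + response")
    return problems


good = {
    "total_token_count": 9,
    "prompt_token_count": 6,
    "response_token_count": 3,
    "response_latency_ms": 120.5,
}
bad = dict(good, total_token_count=10)

print(find_problems(good))
print(find_problems(bad))
```

Records that return a non-empty list can be fixed or dropped before the dataframe is logged.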
(Coming soon) Automatic collection of these attributes as part of ingestion logging.