Embedding Cluster Summarization
Use GPT to summarize clusters of your embeddings
With the overwhelming amount of text data processed by large language models (LLMs) every day, it's challenging to manually comb through each prompt for analysis and insights. The Cluster Summary feature in Arize tackles this problem head-on by automatically grouping related prompts or responses together, turning unwieldy data sets into easily understandable clusters.
How does this work?
Arize sends the raw text associated with your prompt or response embeddings to OpenAI and uses the LLM you select to generate a summary of each cluster. Summary quality can vary between models. More functionality is planned for this feature.
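To make the flow concrete, here is a minimal sketch of the idea: gather the raw text for one cluster, build a single prompt, and hand it to an LLM. This is not Arize's actual implementation. The function names (`build_summary_prompt`, `summarize_cluster`), the prompt wording, and the `max_examples` cap are all assumptions made for illustration, and `complete` stands in for whatever call reaches your selected model.

```python
from typing import Callable, Sequence


def build_summary_prompt(texts: Sequence[str], max_examples: int = 20) -> str:
    """Assemble one prompt asking an LLM to summarize a cluster of raw texts.

    Hypothetical helper: the prompt wording and the example cap are assumptions.
    """
    # Cap the number of examples so the prompt stays a manageable size.
    sample = list(texts)[:max_examples]
    numbered = "\n".join(
        f"{i}. {text.strip()}" for i, text in enumerate(sample, start=1)
    )
    return (
        "The following texts belong to one cluster of similar prompts or responses.\n"
        "Describe their common theme in one or two sentences.\n\n"
        f"{numbered}"
    )


def summarize_cluster(
    texts: Sequence[str],
    complete: Callable[[str], str],
    max_examples: int = 20,
) -> str:
    """Return a summary for one cluster.

    `complete` is any function that takes a prompt string and returns the
    model's text (for example, a thin wrapper around your LLM provider's client).
    """
    if not texts:
        # Nothing to summarize, so skip the model call entirely.
        return ""
    return complete(build_summary_prompt(texts, max_examples)).strip()
```

Passing the model call in as `complete` keeps the sketch independent of any particular provider or model, which mirrors the point above that different models can produce summaries of different quality.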
How do I set this up?
Check out our Integrations to see how to gain access to LLMs in Arize.
Why use this feature?
Efficient Analysis: By summarizing and grouping raw text prompts, you can analyze your data more efficiently. No need to go through each prompt manually, saving your team valuable time.
Identify Patterns: Uncover underlying patterns or trends that might otherwise be overlooked. This can provide valuable insights into what your users are asking or discussing most often.
Focus Improvement Efforts: Identify clusters that generate low-quality responses or hallucinations from your LLM. By pinpointing these problematic areas, you can focus your improvement efforts where they're needed most.
Enhanced Understanding: Gain a deeper understanding of your LLM's performance. With a high-level view of the prompts being fed into your model, you can better appreciate how it interprets and responds to different types of input.