Overview: Retrieval

Last updated 1 year ago

Many LLM applications use a technique called Retrieval-Augmented Generation (RAG). These applications retrieve data from their knowledge base so that the LLM has the appropriate context to accomplish its task.

However, these retrieval systems can still hallucinate or return answers that are not relevant to the user's input query. To evaluate a retrieval system, we can ask:

Are there certain types of questions the chatbot gets wrong more often?
Are the documents that the system retrieves irrelevant? Do we have the right documents to answer the question?
Does the response match the provided documents?
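
As a rough illustration of the first two questions, the sketch below scores retrieval quality with precision@k and groups answer errors by question type. It is a minimal sketch and not Phoenix's API: the function names, the record fields (`question_type`, `retrieved_relevant`, `answer_correct`), and the sample data are all hypothetical, and it assumes relevance and correctness labels are already available, for example from human annotation or an LLM-based evaluator.

```python
from collections import defaultdict


def precision_at_k(relevance_labels, k):
    """Fraction of the top-k retrieved documents that were labeled relevant."""
    top_k = relevance_labels[:k]
    if not top_k:
        return 0.0
    return sum(top_k) / len(top_k)


def error_rate_by_question_type(records):
    """Share of incorrect answers for each question type."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for record in records:
        totals[record["question_type"]] += 1
        if not record["answer_correct"]:
            errors[record["question_type"]] += 1
    return {qtype: errors[qtype] / totals[qtype] for qtype in totals}


# Hypothetical, hand-labeled records (assumption: one record per user query,
# with retrieved documents listed in ranked order).
records = [
    {"question_type": "pricing", "retrieved_relevant": [True, False, False], "answer_correct": False},
    {"question_type": "pricing", "retrieved_relevant": [True, True, False], "answer_correct": True},
    {"question_type": "setup", "retrieved_relevant": [True, True, True], "answer_correct": True},
]

for record in records:
    score = precision_at_k(record["retrieved_relevant"], k=2)
    print(record["question_type"], score)

print(error_rate_by_question_type(records))
```

A low precision@k points to irrelevant retrieved documents, while a high error rate concentrated in one question type suggests the knowledge base may lack the right documents for that topic.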

Phoenix supports retrieval troubleshooting and evaluation on both traces and inferences, but inferences are currently required to visualize your retrievals with UMAP. The table below summarizes the differences.

| Feature | Traces & Spans | Inferences |
| --- | --- | --- |
| Troubleshooting for LLM applications | ✅ | ✅ |
| Follow the entirety of an LLM workflow | ✅ | 🚫 support for spans only |
| Embeddings Visualizer | 🚧 on the roadmap | ✅ |
