👩‍🏫 AI Research
Open Source Research from the Arize AI Research Team
Copyright © 2023 Arize AI, Inc
We benchmarked o1-preview on our hardest eval task: time series trend evaluations. This post compares its performance against GPT-4o-mini, Claude 3.5 Sonnet, and GPT-4o.
We compare the performance and cost savings of prompt caching on Anthropic vs OpenAI.
We compare OpenAI's experimental Swarm repo with two other popular multi-agent frameworks: AutoGen and CrewAI.
Lessons learned from our journey to one million downloads of our OpenTelemetry wrapper, OpenInference.
We built the same agent in LangGraph, LlamaIndex Workflows, CrewAI, AutoGen, and pure code. See how the frameworks compare.
Testing the generation stage of RAG across GPT-4 and Claude 2.1.