Arize-ai/phoenix
AI Observability & Evaluation
Provides OpenTelemetry-based tracing, LLM-powered evaluation, versioned datasets, and experiment tracking across LLM frameworks (LangGraph, LlamaIndex, the Claude and OpenAI agent SDKs) and model providers. Features a web UI with a prompt-optimization playground, dataset management, and call replay. Runs locally, in notebooks, or containerized (with Helm support), and integrates via auto-instrumentation through the OpenInference standard.
8,847 stars and 1,013,605 monthly downloads. Used by 7 other packages. Actively maintained with 330 commits in the last 30 days. Available on PyPI.
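For orientation, a minimal tracing setup looks roughly like the sketch below. It assumes the arize-phoenix and openinference-instrumentation-openai packages are installed; the project name is illustrative.

import phoenix as px
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

# Launch a local Phoenix instance (web UI plus an OTLP trace collector).
session = px.launch_app()

# Point an OpenTelemetry tracer provider at the local collector.
# "my-llm-app" is an illustrative project name.
tracer_provider = register(project_name="my-llm-app")

# Auto-instrument OpenAI client calls via the OpenInference instrumentor,
# so each completion appears as a trace in the Phoenix UI.
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)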
Stars: 8,847
Forks: 753
Language: Jupyter Notebook
License: —
Category: —
Last pushed: Mar 13, 2026
Monthly downloads: 1,013,605
Commits (30d): 330
Dependencies: 46
Reverse dependents: 7
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/Arize-ai/phoenix"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
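The same lookup can be done programmatically. A minimal sketch using only the Python standard library, assuming the endpoint returns JSON (the response schema is not documented on this page):

import json
from urllib.request import urlopen

# Anonymous access allows 100 requests/day; a free key raises that to 1,000/day.
url = "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/Arize-ai/phoenix"
with urlopen(url) as resp:
    data = json.load(resp)

# The schema is undocumented here, so pretty-print the payload for inspection.
print(json.dumps(data, indent=2))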
Related tools
langfuse/langfuse
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management,...
Mirascope/mirascope
The LLM Anti-Framework
Helicone/helicone
🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓
Agenta-AI/agenta
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM...
algorithmicsuperintelligence/optillm
Optimizing inference proxy for LLMs