phoenix and helicone

These are **competitors** offering overlapping core functionality: both provide end-to-end LLM observability with logging, monitoring, and evaluation capabilities. Phoenix, however, has significantly broader adoption (over 1M monthly downloads vs. under 300) and a more mature feature set.

| Metric | phoenix | helicone |
| --- | --- | --- |
| Score | 94 (Verified) | 81 (Verified) |
| Maintenance | 25/25 | 20/25 |
| Adoption | 25/25 | 16/25 |
| Maturity | 25/25 | 25/25 |
| Community | 19/25 | 20/25 |
| Stars | 8,847 | 5,237 |
| Forks | 753 | 494 |
| Downloads | 1,013,605 | 292 |
| Commits (30d) | 330 | 7 |
| Language | Jupyter Notebook | TypeScript |
| License | (not listed) | Apache-2.0 |
| Risk flags | None | None |

About phoenix

Arize-ai/phoenix

AI Observability & Evaluation

Provides OpenTelemetry-based tracing, LLM-powered evaluation, versioned datasets, and experiment tracking across LLM frameworks (LangGraph, LlamaIndex, Claude/OpenAI agent SDKs) and providers. Features a web UI with prompt optimization playground, dataset management, and call replay capabilities. Runs locally, in notebooks, or containerized with Helm support, and integrates via auto-instrumentation through the OpenInference standard.

About helicone

Helicone/helicone

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

Operates as a reverse proxy AI gateway that intercepts requests to 100+ LLM providers through a unified OpenAI-compatible API, enabling intelligent routing and automatic fallbacks. Built on a microservices architecture with a Cloudflare Workers proxy layer for request interception, Express-based collection server (Jawn), ClickHouse for analytics, and Supabase for application data. Integrates with OpenAI, Anthropic, Gemini, LangChain, Vercel AI SDK, and supports self-hosting via Docker or Helm with optional async logging through OpenLLMetry.

Scores updated daily from GitHub, PyPI, and npm data.