langfuse and agenta

Langfuse and Agenta are direct competitors with overlapping LLMOps functionality (observability, prompt management, evaluation, playground). Langfuse has broader integration support and much wider adoption (23,106 GitHub stars versus 3,923), while Agenta presents itself as a more self-contained, all-in-one platform.

Project         Score    Status
langfuse        95       Verified
agenta          72       Verified

Category        langfuse        agenta
Maintenance     25/25           25/25
Adoption        25/25           10/25
Maturity        25/25           16/25
Community       20/25           21/25

Metric          langfuse        agenta
Stars           23,106          3,923
Forks           2,333           492
Downloads       3,912,905       (not listed)
Commits (30d)   240             731
Language        TypeScript      TypeScript
License         (not listed)    (not listed)
Risk flags      No risk flags   No Package, No Dependents
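
The four category scores listed above add up to each project's overall score (95 and 72). The short check below assumes the overall score is a plain sum of the categories; the page does not state the aggregation rule here, so treat that as an inference from the numbers rather than the documented method.

```python
# Category scores copied from the comparison above (each out of 25).
scores = {
    "langfuse": {"maintenance": 25, "adoption": 25, "maturity": 25, "community": 20},
    "agenta": {"maintenance": 25, "adoption": 10, "maturity": 16, "community": 21},
}


def overall(name: str) -> int:
    """Sum the four category scores for one project (assumed aggregation rule)."""
    return sum(scores[name].values())


print(overall("langfuse"))  # 95
print(overall("agenta"))    # 72
```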

About langfuse

langfuse/langfuse

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

Provides distributed tracing via SDKs (Python, JavaScript/TypeScript) that capture full LLM call chains with automatic context propagation, backed by ClickHouse for scalable analytics. Features a unified API surface for programmatic access to traces, evaluations, and datasets, enabling custom workflows and integration into existing MLOps pipelines alongside LangChain, LlamaIndex, and other frameworks.

About agenta

Agenta-AI/agenta

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.

Supports 50+ LLMs with bring-your-own-model capabilities, and includes OpenTelemetry-native tracing for production observability compatible with the OpenLLMetry and OpenInference standards. Features version-controlled prompt management with branching and environments, alongside flexible evaluation via 20+ pre-built evaluators, LLM-as-judge, and custom evaluators accessible through both the UI and programmatic APIs. Self-hostable via Docker Compose with multi-environment support and integrations for major LLM providers and frameworks.
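
The idea of version-controlled prompts with branching and environments can be sketched as a small in-memory registry. This is an illustrative model only, not Agenta's API: `PromptRegistry` and its `commit`, `branch`, `deploy`, and `fetch` methods are hypothetical names, and the "variant" terminology is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PromptRegistry:
    """Hypothetical registry: append-only versions per variant, plus environment pointers."""
    versions: Dict[str, List[str]] = field(default_factory=dict)
    environments: Dict[str, Tuple[str, int]] = field(default_factory=dict)

    def commit(self, variant: str, template: str) -> int:
        """Append a new immutable version and return its 1-based version number."""
        history = self.versions.setdefault(variant, [])
        history.append(template)
        return len(history)

    def branch(self, source: str, new_variant: str) -> int:
        """Start a new variant from the latest version of an existing one."""
        return self.commit(new_variant, self.versions[source][-1])

    def deploy(self, environment: str, variant: str, version: int) -> None:
        """Point an environment (e.g. 'staging') at a specific variant version."""
        if not 1 <= version <= len(self.versions.get(variant, [])):
            raise ValueError(f"unknown version {version} for variant {variant!r}")
        self.environments[environment] = (variant, version)

    def fetch(self, environment: str) -> str:
        """Return the template currently deployed to an environment."""
        variant, version = self.environments[environment]
        return self.versions[variant][version - 1]
```

In this model, application code asks for "whatever is deployed to production" rather than hard-coding a prompt, so promoting or rolling back a prompt is a pointer change in `environments` and never rewrites an existing version.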

Scores are updated daily from GitHub, PyPI, and npm data.