LLM Observability & Monitoring LLM Tools

Tools for observing, tracing, monitoring, and evaluating LLM applications in production. Includes metrics collection, span tracking, performance analysis, and system health dashboards. Does NOT include LLM serving infrastructure, prompt management, or general application logging.

75 LLM observability and monitoring tools are tracked. Two score above 70 (Verified tier). The highest-rated is apache/hertzbeat at 73/100 with 7,121 stars. Four of the top 10 are actively maintained.

Get all 75 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-observability-monitoring&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
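The same request can be made from a script. The following is a minimal Python sketch using only the standard library; it assumes the endpoint returns a JSON body, and the helper names `build_url` and `fetch_projects` are hypothetical. The `domain`, `subcategory`, and `limit` parameters are taken from the curl example above. Whether the API accepts a `limit` larger than 20 is not stated here, so the default mirrors the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools",
              subcategory="llm-observability-monitoring",
              limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=10):
    """Fetch and decode the response (assumed to be JSON; its shape is not documented here)."""
    with urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` performs one unauthenticated request, which counts against the 100 requests/day allowance, so caching the result locally is a sensible default.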

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | apache/hertzbeat | An AI-powered next-generation open source real-time observability system. | 73 | Verified |
| 2 | traceloop/openllmetry | Open-source observability for your GenAI or LLM application, based on OpenTelemetry | 71 | Verified |
| 3 | Arize-ai/openinference | OpenTelemetry Instrumentation for AI Observability | 69 | Established |
| 4 | vndee/llm-sandbox | Lightweight and portable LLM sandbox runtime (code interpreter) Python library. | 68 | Established |
| 5 | Scale3-Labs/langtrace-python-sdk | Langtrace SDK for Python Applications | 58 | Established |
| 6 | utkuozdemir/nvidia_gpu_exporter | Nvidia GPU exporter for Prometheus using nvidia-smi binary | 55 | Established |
| 7 | Dynatrace/obslab-llm-observability | Search for a holiday and get destination advice from an LLM. Observability... | 49 | Emerging |
| 8 | secretflow/trustflow | A privacy-preserving computing system based on TEE. | 48 | Emerging |
| 9 | opensearch-project/observability-stack | OpenSearch Observability Stack | 43 | Emerging |
| 10 | openlit/website | Open Source OpenTelemetry-native Observability tool for GenAI and LLMs... | 41 | Emerging |
| 11 | onvo-ai/loghead | Loghead is a tool that allows LLMs in your vibe coding tool to have access... | 40 | Emerging |
| 12 | Ablustrund/MPLSandbox | MPLSandbox is an out-of-the-box multi-programming language sandbox designed... | 39 | Emerging |
| 13 | cchinchilla-dev/agentloom | Deterministic LLM workflow orchestration with native observability,... | 39 | Emerging |
| 14 | clay-good/proxilion-grc | Proxilion GRC is a zero-configuration network-layer MITM proxy that secures... | 37 | Emerging |
| 15 | mazen160/llmquery | Powerful LLM Query Framework with YAML Prompt Templates. Made for Automation | 37 | Emerging |
| 16 | jmamda/OpenTrace | A local reverse proxy that records every LLM request/response to SQLite. No... | 36 | Emerging |
| 17 | HuckleR2003/PC_Workman_HCK | Real-time system monitor that explains WHY your PC is slow, not just that... | 36 | Emerging |
| 18 | chaitanyya/lookout | Track, analyze, and improve what LLMs are saying | 34 | Emerging |
| 19 | aimusubi/aimusubi | Local-first agentic NetOps framework that connects LLMs to real network... | 33 | Emerging |
| 20 | prajeesh-chavan/OpenLLM-Monitor | OpenLLM Monitor is a plug-and-play, real-time observability dashboard for... | 32 | Emerging |
| 21 | copyleftdev/robin-smesh | 🕸️ Decentralized Dark Web OSINT Framework \| Rust \| SMESH Signal Diffusion \|... | 32 | Emerging |
| 22 | Scale3-Labs/langtrace-typescript-sdk | Langtrace SDK for NodeJS Applications | 31 | Emerging |
| 23 | demml/scouter | Monitoring, Evaluation and Observability for AI Applications | 31 | Emerging |
| 24 | ZacAttack/HeapDumpStarDiver | Allows for fast parsing of an HPROF file to parquet format so that it can be... | 31 | Emerging |
| 25 | sarva-20/LLM-Observability-FOSS | 🧠 Learn LLM Observability step-by-step using FOSS tools. From zero... | 29 | Experimental |
| 26 | mxcrafts/ltrack | Security Observability Framework for ML/AI Model File Loading | 29 | Experimental |
| 27 | langfuse/oss-llmops-stack | Modular, open source LLMOps stack that separates concerns: LiteLLM unifies... | 29 | Experimental |
| 28 | raaihank/llm-sentinel | Privacy-first proxy that automatically detects and masks sensitive data... | 28 | Experimental |
| 29 | eunomia-bpf/ebpf-knowledge-base | An eBPF knowledge base, based on llama_index and bpf-developer-tutorial | 27 | Experimental |
| 30 | eullm/eullm | Open-source platform for creating, distributing and running sovereign... | 27 | Experimental |
| 31 | mithril-security/blind_llama_client | Zero-trust AI APIs for easy and private consumption of open-source LLMs | 26 | Experimental |
| 32 | JehanneDussert/govllm | Production-grade LLM observability stack - sovereign, GDPR-compliant, open source. | 25 | Experimental |
| 33 | eduardoslonski/telescope | Scalable high-performance async RL post-training framework for LLMs with... | 24 | Experimental |
| 34 | Pranavh-2004/DevMonitor | DevMonitor is a real-time developer dashboard that aggregates AI research,... | 24 | Experimental |
| 35 | thoughtbot/opentelemetry-instrumentation-ruby_llm | OpenTelemetry instrumentation for RubyLLM. 💬🔭 | 24 | Experimental |
| 36 | JehanneDussert/llm_governance_monitoring | Production-grade LLM observability stack - sovereign, GDPR-compliant, open source. | 23 | Experimental |
| 37 | eduardoslonski/telescope-ui | Real-time observability dashboard for the Telescope RL post-training... | 23 | Experimental |
| 38 | ftaghiyev/firewall-configuration-interface | A Natural Language Interface for Firewall Configuration | 23 | Experimental |
| 39 | cristianxruvalcaba-coder/tiresias-core | Tiresias core library: shared primitives for the multi-provider LLM proxy... | 23 | Experimental |
| 40 | james-martinez/lemonade-dashboard | A management dashboard for Lemonade Server. This extension provides a visual... | 23 | Experimental |
| 41 | tied-inc/eval-track | LLM-ML-Observability Toolkits and Services | 23 | Experimental |
| 42 | broomva/vigil | OpenTelemetry-native observability: GenAI semantic conventions,... | 23 | Experimental |
| 43 | andrewn6/traceway | Traceway: observability for LLMs | 23 | Experimental |
| 44 | Blastgits/traceway | Traceway: observability for LLMs | 23 | Experimental |
| 45 | tyabu12/hamoru | "Terraform for LLMs." Declaratively orchestrate multiple LLM providers in... | 22 | Experimental |
| 46 | AdametherzLab/agent-drift-watch | CLI that snapshots LLM prompt/response pairs and alerts when model behavior... | 22 | Experimental |
| 47 | MoebiusX/KrystalineX | KrystalineX: Institutional-grade crypto exchange demo platform with... | 22 | Experimental |
| 48 | ogulcanaydogan/LLM-SLO-eBPF-Toolkit | eBPF-based SLO observability for LLM inference latency on Kubernetes | 22 | Experimental |
| 49 | brookrunning734/trace-ui | Visualize and analyze large-scale ARM64 execution traces with fast browsing,... | 22 | Experimental |
| 50 | wkusnierczyk/redoxide | High-performance, modular, extensible LLM Red Teaming tool written in Rust. | 22 | Experimental |
| 51 | DiogoRibeiro7/llm-observability-analytics | Observability and analytics layer for LLM systems, capturing... | 22 | Experimental |
| 52 | LakshmiSravyaVedantham/llm-lens | A flight recorder for AI agents: replay every LLM call step-by-step to find... | 22 | Experimental |
| 53 | Skobyn/llm-output-governance | A practical Python toolkit for evaluating, monitoring, and governing LLM... | 22 | Experimental |
| 54 | yeahns278/lemonade-dashboard | Manage Lemonade Server models, backends, and settings within VS Code using a... | 22 | Experimental |
| 55 | GenesisClawbot/llm-drift | LLM drift detector: know within 5 min when GPT-4o, Claude, or Gemini... | 22 | Experimental |
| 56 | kimasplund/LLM-Context-Trace-Library | Time-travel debugging for multi-agent LLM workflows. Replay execution to any... | 21 | Experimental |
| 57 | kingfs/llm-tracelab | A proxy-based tool for tracing, recording, and replaying LLM API requests. | 20 | Experimental |
| 58 | node-llm/node-llm-monitor | Production-grade observability for LLM applications in Node.js. | 20 | Experimental |
| 59 | HelgeSverre/llmflow | Local-first LLM observability. Trace agents, chains, and LLM calls with... | 19 | Experimental |
| 60 | romanmatena/browsermonitor | Browser console and network monitoring for debugging and LLM workflows.... | 19 | Experimental |
| 61 | jwilger/union_square | Wire-tap your application's LLM interactions for performance analysis and... | 18 | Experimental |
| 62 | cmangun/llm-observability-dashboards | Prometheus + Grafana observability stack for LLM-powered systems | 16 | Experimental |
| 63 | AjaCHN/LLM-API-Sentinel | Real-time monitoring and historical availability tracking system for the world's mainstream large-model APIs. Real-time monitoring and historical availability... | 15 | Experimental |
| 64 | Scale3-Labs/langtrace-trace-attributes | Trace Attributes for Langtrace | 14 | Experimental |
| 65 | medtotti/nektor | 🔍 Generate AI-powered tail-based sampling policies for Honeycomb Refinery... | 14 | Experimental |
| 66 | Ajithkumae/observability-toolkit | Provide a framework to build and test observability tools with integrated... | 14 | Experimental |
| 67 | voynow/maintainability | LLM driven static code analysis for quantifying maintainability [no longer active] | 13 | Experimental |
| 68 | prkbuilds/otel-ai-go | OpenTelemetry GenAI semantic conventions for Go: drop-in HTTP middleware,... | 12 | Experimental |
| 69 | quarktetra23/LLM_staticanalysis | Pylint Code Analysis for LLMs | 12 | Experimental |
| 70 | sec-view/FluxPeek | FluxPeek is a desktop app for inspecting huge dataset files with... | 12 | Experimental |
| 71 | AbdelStark/dumbmeter | A daily snapshot of when popular models drift from their baseline. Auto... | 12 | Experimental |
| 72 | kodlan/llm-observability-pack | Ready-to-run observability stack for LLM inference servers (vLLM, Triton)... | 11 | Experimental |
| 73 | moondef/vhs | Record, edit, and replay LLM HTTP interactions. Deterministic testing for AI... | 11 | Experimental |
| 74 | dytsou/sodets | SODETS automates production error reproduction by synthesizing complex... | 11 | Experimental |
| 75 | modelmetry/modelmetry-sdk-js | The Modelmetry JS/TS SDK allows developers to easily integrate Modelmetry’s... | 10 | Experimental |