ksm26/Evaluating-AI-Agents
A hands-on repository for the "Evaluating AI Agents" course, created with Arize AI, that teaches you how to systematically evaluate, debug, and improve AI agents using observability tools, structured experiments, and reliable metrics. It covers production-grade techniques for improving agent performance both during development and after deployment.
No commits in the last 6 months.
Stars: 1
Forks: 1
Language: Jupyter Notebook
License: —
Category: —
Last pushed: May 12, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/ksm26/Evaluating-AI-Agents"
Open to everyone: 100 requests/day with no API key. Get a free key for 1,000 requests/day.
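For programmatic access, a minimal Python sketch of the same request is shown below. It assumes the endpoint returns JSON; the authentication mechanism for keyed access and the exact response field names are not documented here, so inspect the actual response for the real schema.

# Minimal sketch: fetch repo quality data from the endpoint shown above.
# Assumes a JSON response; keys are printed as returned, since the schema is not documented here.
import requests

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering"

def fetch_repo_quality(owner: str, repo: str) -> dict:
    """Fetch quality data for a repository, e.g. ksm26/Evaluating-AI-Agents."""
    response = requests.get(f"{BASE_URL}/{owner}/{repo}", timeout=10)
    response.raise_for_status()  # surface 4xx/5xx errors (e.g. the daily rate limit)
    return response.json()

if __name__ == "__main__":
    data = fetch_repo_quality("ksm26", "Evaluating-AI-Agents")
    for key, value in data.items():
        print(f"{key}: {value}")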
Higher-rated alternatives:
langfuse/langfuse: 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management,...
Arize-ai/phoenix: AI Observability & Evaluation
Mirascope/mirascope: The LLM Anti-Framework
Helicone/helicone: 🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓
Agenta-AI/agenta: The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM...