The AI Evals Directory

Quality-scored directory of 0 AI evaluation tools, updated daily. Every tool is scored on maintenance, adoption, maturity, and community signals.

Tools for evaluating, benchmarking, and observing AI systems — from LLM eval harnesses to production observability platforms like Langfuse and LangSmith.

Tier          Tools   Score range
Verified      0       70–100
Established   0       50–69
Emerging      0       30–49
Experimental  0       10–29
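The four tiers are defined by fixed score bands. A minimal sketch of that mapping, assuming the bands shown above (the function name and the handling of out-of-band scores are illustrative assumptions, not part of the directory's published API):

```python
def quality_tier(score: int) -> str:
    """Map a 10-100 quality score to its directory tier.

    Bands follow the table above; anything outside them is
    treated as unscored (an assumption for this sketch).
    """
    if 70 <= score <= 100:
        return "Verified"
    if 50 <= score <= 69:
        return "Established"
    if 30 <= score <= 49:
        return "Emerging"
    if 10 <= score <= 29:
        return "Experimental"
    return "Unscored"
```

Note that the bands are contiguous over 10–100, so every scored tool falls into exactly one tier.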

Top tools by quality score

#   Tool   Score

Browse by category