AI Debate Arenas LLM Tools
Interactive platforms where AI models or humans compete in structured debates with scoring/judging. Includes multi-model comparison tools, live debate staging, and adversarial benchmarking. Does NOT include general competitive game frameworks, coding competitions, or security red-team tools without debate mechanics.
There are 42 AI debate arena tools tracked. The highest-rated is betagouv/ComparIA, which scores 48/100 and has 63 stars.
Get all 42 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=ai-debate-arenas&limit=20"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
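The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It reuses the endpoint and the `domain`, `subcategory`, and `limit` query parameters from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical. The response schema is not documented here, so the sketch only parses the body as JSON and makes no assumption about its fields. Note that the curl example sets `limit=20`; whether a larger value returns all 42 projects in one call is an assumption that would need checking against the API.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="ai-debate-arenas", limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch the dataset and return the parsed JSON body.

    The structure of the returned object is not assumed; inspect it before
    relying on any particular field.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (performs a network request, counted against the daily quota):
# data = fetch_projects()
# print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to check the query string without spending any of the daily request allowance.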
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | betagouv/ComparIA | Open source LLM arena created by the French Government | 48/100 | Emerging |
| 2 | liuxiaotong/ai-dataset-radar | Multi-source async competitive intelligence engine for AI training data... | | Emerging |
| 3 | Skytliang/Multi-Agents-Debate | MAD: The first work to explore Multi-Agent Debate with Large Language Models :D | | Emerging |
| 4 | llm-ring/lmring | Open-source, self-hostable LLM arena with model compare, voting, and leaderboards | | Emerging |
| 5 | Arnoldlarry15/ARES-Dashboard | AI Red Team Operations Console | | Emerging |
| 6 | khoren93/ai-debates | Orchestrate epic battles between 600+ AI models (GPT-5, Gemini 3, DeepSeek... | | Emerging |
| 7 | OpenLLM-Council/dev-council | An experimental framework for building collaborative coding agents that... | | Experimental |
| 8 | YerbaPage/SWE-Debate | SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution | | Experimental |
| 9 | rd-serendipity/ai-debate-arena | AI Debate Arena: Streamlit app for AI model debates. Features multi-model... | | Experimental |
| 10 | jhammant/agent-drift | Stress-test AI agents for goal drift and system prompt violations. Inspired... | | Experimental |
| 11 | debate/debate-ai.com | Debate AI enables debaters in PF, LD & Policy to streamline AI research... | | Experimental |
| 12 | jeremysball/go-pair | generate adversarial prompts for LLMs | | Experimental |
| 13 | aramirez-maza/beto-framework | BETO formalizes the ignorance of an AI. Materializes raw semantic intent... | | Experimental |
| 14 | 13120740298z-lang/AI-Tech-Radar | 🤖 Automated AI technology intelligence platform — tracks GitHub AI projects,... | | Experimental |
| 15 | sanifhimani/llm-colosseum | AI models fight each other in a pixel arena every day. They decide what to... | | Experimental |
| 16 | lukeslp/consensus | Multi-model research debate engine — 8+ language models independently... | | Experimental |
| 17 | qingni/TechSentry | 🛡️ AI-powered tech intelligence tool that auto-tracks GitHub repos, Hacker... | | Experimental |
| 18 | dinesh-git17/debate-lab | Watch AI models debate any topic in real-time. ChatGPT and Grok argue,... | | Experimental |
| 19 | homeofe/ai-security-arena | AI Security Arena: Interactive web interface for AI-powered Red Team vs Blue... | | Experimental |
| 20 | Aimaghsoodi/FailSafe | AI Failure Intelligence — predict, detect, and recover from AI agent failures | | Experimental |
| 21 | Firmislabs/ai-inventory | Detect AI frameworks, LLM dependencies, and model files in any project. One... | | Experimental |
| 22 | the-meta-value/The-Perfect-Storm | tracking AI system failures with solar weather | | Experimental |
| 23 | florykhan/TelusGuardAI | AI-powered network impact analyzer. Natural-language queries → multi-agent... | | Experimental |
| 24 | EdouardZemb/test-framework | CLI for TRA QA process optimization: Jira/Squash triage, testability... | | Experimental |
| 25 | lechmazur/debate | Adversarial multi-turn benchmark for LLM debate quality, using side-swapped... | | Experimental |
| 26 | netlify/nextjs-sentinel | Monitors Next.js releases for relevance to Netlify | | Experimental |
| 27 | SurgeCLI/Surge | Surge is a lightweight self-learning AI observability and remediation agent... | | Experimental |
| 28 | ethicals7s/EchoArena | Local LLM debate arena — make two models battle any topic offline, third... | | Experimental |
| 29 | bassrehab/artemis-agents | A production-ready framework for structured multi-agent debates with... | | Experimental |
| 30 | ayushdwivedi001/AI-Radar-Pulse | An autonomous high-availability framework for tracking, ranking, and... | | Experimental |
| 31 | armsp/AIFU | AI flub ups | | Experimental |
| 32 | Gliangquan/awesome-ai-radar | Daily curated AI, LLM, and agent project radar from GitHub | | Experimental |
| 33 | samuel-dobrancin-qa/ai-content-quality-framework | Structured framework for evaluating AI content generator quality across six... | | Experimental |
| 34 | Rak2k6/gig-audit | Fair Gig Guardian is an AI-powered platform that analyzes gig economy... | | Experimental |
| 35 | gregorydouglasquarles/lavaflow-site | AI-driven predictive safety ecosystem (qView) and multimedia architecture.... | | Experimental |
| 36 | VAMP-NEER/release-radar-ai | 🔍 Ultimate Free GitHub Trend Tracker 2026 🚀 \| AI-Powered Repo & Dev Team... | | Experimental |
| 37 | Swap-24/ARGUS | Real-time AI debate arena. Argue live against opponents while an ML pipeline... | | Experimental |
| 38 | konradhy/battlearena | A proof of concept illustrating how AI can enhance games | | Experimental |
| 39 | WaterGorilla/agent-arena | Compete with AI agents in a UFC-style coding battle, solving tasks live... | | Experimental |
| 40 | Gouthambalaji03/AI-Debate-Arena | AI Debate Arena is an interactive web experience that stages rapid-fire... | | Experimental |
| 41 | Privacy-Engineering-CMU/ai-risk-prettified | A prettified page for MIT's AI Risk Database | | Experimental |
| 42 | andronov04/aiarena | Client-side AI arena for comparing 1000+ models across 68+ providers. No... | | Experimental |