AI Debate Arenas LLM Tools

Interactive platforms where AI models or humans compete in structured debates with scoring/judging. Includes multi-model comparison tools, live debate staging, and adversarial benchmarking. Does NOT include general competitive game frameworks, coding competitions, or security red-team tools without debate mechanics.

42 AI debate arena tools are tracked. The highest-rated is betagouv/ComparIA at 48/100, with 63 stars.

Get the projects as JSON (42 are tracked; the example URL below sets limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=ai-debate-arenas&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
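The same request can be made from a script. The following is a minimal Python sketch, assuming only the endpoint and query parameters visible in the curl command above. The function names `build_url` and `fetch_projects` are hypothetical helpers, and the shape of the returned JSON is not documented here, so the sketch returns the parsed payload without assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example; nothing else about the API is assumed.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="ai-debate-arenas", limit=20):
    """Build the request URL from the query parameters used in the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and return the parsed JSON payload (schema not assumed)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects(build_url())` performs the same request as the curl command. Whether a larger `limit` value returns all 42 projects in one call is an assumption to verify against the API.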

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | betagouv/ComparIA | Open source LLM arena created by the French Government | 48 | Emerging |
| 2 | liuxiaotong/ai-dataset-radar | Multi-source async competitive intelligence engine for AI training data... | 46 | Emerging |
| 3 | Skytliang/Multi-Agents-Debate | MAD: The first work to explore Multi-Agent Debate with Large Language Models :D | 42 | Emerging |
| 4 | llm-ring/lmring | Open-source, self-hostable LLM arena with model compare, voting, and leaderboards | 40 | Emerging |
| 5 | Arnoldlarry15/ARES-Dashboard | AI Red Team Operations Console | 40 | Emerging |
| 6 | khoren93/ai-debates | Orchestrate epic battles between 600+ AI models (GPT-5, Gemini 3, DeepSeek... | 31 | Emerging |
| 7 | OpenLLM-Council/dev-council | An experimental framework for building collaborative coding agents that... | 29 | Experimental |
| 8 | YerbaPage/SWE-Debate | SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution | 29 | Experimental |
| 9 | rd-serendipity/ai-debate-arena | AI Debate Arena: Streamlit app for AI model debates. Features multi-model... | 26 | Experimental |
| 10 | jhammant/agent-drift | Stress-test AI agents for goal drift and system prompt violations. Inspired... | 26 | Experimental |
| 11 | debate/debate-ai.com | Debate AI enables debaters in PF, LD & Policy to streamline AI research... | 26 | Experimental |
| 12 | jeremysball/go-pair | generate adversarial prompts for LLMs | 25 | Experimental |
| 13 | aramirez-maza/beto-framework | BETO formalizes the ignorance of an AI. Materializes raw semantic intent... | 24 | Experimental |
| 14 | 13120740298z-lang/AI-Tech-Radar | 🤖 Automated AI technology intelligence platform — tracks GitHub AI projects,... | 23 | Experimental |
| 15 | sanifhimani/llm-colosseum | AI models fight each other in a pixel arena every day. They decide what to... | 23 | Experimental |
| 16 | lukeslp/consensus | Multi-model research debate engine — 8+ language models independently... | 23 | Experimental |
| 17 | qingni/TechSentry | 🛡️ AI-powered tech intelligence tool that auto-tracks GitHub repos, Hacker... | 22 | Experimental |
| 18 | dinesh-git17/debate-lab | Watch AI models debate any topic in real-time. ChatGPT and Grok argue,... | 22 | Experimental |
| 19 | homeofe/ai-security-arena | AI Security Arena: Interactive web interface for AI-powered Red Team vs Blue... | 22 | Experimental |
| 20 | Aimaghsoodi/FailSafe | AI Failure Intelligence — predict, detect, and recover from AI agent failures | 22 | Experimental |
| 21 | Firmislabs/ai-inventory | Detect AI frameworks, LLM dependencies, and model files in any project. One... | 22 | Experimental |
| 22 | the-meta-value/The-Perfect-Storm | tracking AI system failures with solar weather | 20 | Experimental |
| 23 | florykhan/TelusGuardAI | AI-powered network impact analyzer. Natural-language queries → multi-agent... | 19 | Experimental |
| 24 | EdouardZemb/test-framework | CLI for TRA QA process optimization: Jira/Squash triage, testability... | 19 | Experimental |
| 25 | lechmazur/debate | Adversarial multi-turn benchmark for LLM debate quality, using side-swapped... | 18 | Experimental |
| 26 | netlify/nextjs-sentinel | Monitors Next.js releases for relevance to Netlify | 17 | Experimental |
| 27 | SurgeCLI/Surge | Surge is a lightweight self-learning AI observability and remediation agent... | 17 | Experimental |
| 28 | ethicals7s/EchoArena | Local LLM debate arena — make two models battle any topic offline, third... | 15 | Experimental |
| 29 | bassrehab/artemis-agents | A production-ready framework for structured multi-agent debates with... | 15 | Experimental |
| 30 | ayushdwivedi001/AI-Radar-Pulse | An autonomous high-availability framework for tracking, ranking, and... | 15 | Experimental |
| 31 | armsp/AIFU | AI flub ups | 14 | Experimental |
| 32 | Gliangquan/awesome-ai-radar | Daily curated AI, LLM, and agent project radar from GitHub | 14 | Experimental |
| 33 | samuel-dobrancin-qa/ai-content-quality-framework | Structured framework for evaluating AI content generator quality across six... | 14 | Experimental |
| 34 | Rak2k6/gig-audit | Fair Gig Guardian is an AI-powered platform that analyzes gig economy... | 14 | Experimental |
| 35 | gregorydouglasquarles/lavaflow-site | AI-driven predictive safety ecosystem (qView) and multimedia architecture.... | 14 | Experimental |
| 36 | VAMP-NEER/release-radar-ai | 🔍 Ultimate Free GitHub Trend Tracker 2026 🚀 \| AI-Powered Repo & Dev Team... | 14 | Experimental |
| 37 | Swap-24/ARGUS | Real-time AI debate arena. Argue live against opponents while an ML pipeline... | 14 | Experimental |
| 38 | konradhy/battlearena | A proof of concept illustrating how AI can enhance games | 14 | Experimental |
| 39 | WaterGorilla/agent-arena | Compete with AI agents in a UFC-style coding battle, solving tasks live... | 11 | Experimental |
| 40 | Gouthambalaji03/AI-Debate-Arena | AI Debate Arena is an interactive web experience that stages rapid-fire... | 11 | Experimental |
| 41 | Privacy-Engineering-CMU/ai-risk-prettified | A prettified page for MIT's AI Risk Database | 11 | Experimental |
| 42 | andronov04/aiarena | Client-side AI arena for comparing 1000+ models across 68+ providers. No... | 10 | Experimental |