tongye98/Awesome-Code-Benchmark

A comprehensive review of code-domain benchmarks for LLM research.

40 / 100 (Emerging)

Curates and tracks emerging code-specific benchmarks across diverse evaluation dimensions, from code generation and security analysis to repository-level reasoning and multi-turn interactions. Aggregates peer-reviewed research benchmarks, categorized by task type, capability tested, and source institution, so that LLM performance can be compared systematically across hundreds of specialized evaluation datasets. Maintains featured benchmark listings covering recent advances in performance optimization, code translation efficiency, agent-based task solving, and multi-modal code understanding.

208 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 12 / 25
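
The four category scores above are each out of 25, and they add up to the overall 40 / 100. The listing does not state the formula, so the following is only a minimal sketch of that arithmetic under the assumption that the overall score is a plain sum; the names `SUBSCORES`, `total_score`, and `max_score` are hypothetical.

```python
# Category scores copied from the listing above (each out of 25).
SUBSCORES = {
    "Maintenance": 2,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 12,
}
MAX_PER_CATEGORY = 25


def total_score(subscores: dict[str, int]) -> int:
    """Sum the category scores (assumed aggregation, not confirmed by the source)."""
    return sum(subscores.values())


def max_score(subscores: dict[str, int], per_category: int = MAX_PER_CATEGORY) -> int:
    """Highest reachable total given the number of categories."""
    return len(subscores) * per_category


if __name__ == "__main__":
    print(f"{total_score(SUBSCORES)} / {max_score(SUBSCORES)}")
```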

Stars: 208
Forks: 16
Language: not listed
License: MIT
Last pushed: Sep 22, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ai-coding/tongye98/Awesome-Code-Benchmark"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
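
The same request can be made from a script. The sketch below uses only the Python standard library and mirrors the curl command above. It assumes the path follows a category/owner/repository pattern (inferred from that single URL) and that the response body is JSON; neither is confirmed by the listing, and the function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command in the listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the category/owner/repo layout is an assumption."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and parse the body as JSON (assumed response format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Printing the URL needs no network access; call fetch_quality() to send the request.
    print(build_quality_url("ai-coding", "tongye98", "Awesome-Code-Benchmark"))
```

Without a key, each call to `fetch_quality` counts against the 100 requests/day limit, so caching the result locally is a reasonable precaution when polling.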