oripress/AlgoTune
AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code that solves each problem and runs faster than the existing reference implementation.
Includes AlgoTuner, an agentic framework that lets language models iteratively optimize code through automated compilation and benchmarking. Supports distributed execution via SLURM and AWS Batch (with Spot instance optimization) as well as offline evaluation; datasets stream from HuggingFace or can be generated locally for reproducible benchmark runs.
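To give a flavor of what a submission looks like, here is a minimal hypothetical sketch. The Solver class and solve(problem) method follow the task interface described in the AlgoTune paper; the specific task (a symmetric positive-definite linear system) and the problem dict keys "A" and "b" are illustrative assumptions, not one of the 154 problems verbatim:

# Hypothetical AlgoTune-style submission (names and task are assumptions).
import numpy as np
from scipy.linalg import cho_factor, cho_solve

class Solver:
    def solve(self, problem):
        # problem assumed to be a dict holding an SPD matrix "A" and vector "b"
        A = np.asarray(problem["A"])
        b = np.asarray(problem["b"])
        # Cholesky factorization exploits symmetry, roughly halving the work
        # of a generic LU-based np.linalg.solve on SPD inputs
        c, low = cho_factor(A)
        return cho_solve((c, low), b)

The benchmark then times this solve against the reference implementation for the same task.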
Stars
95
Forks
13
Language
Python
License
MIT
Category
Last pushed
Mar 12, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/oripress/AlgoTune"
Open to everyone: 100 requests/day with no key needed. Get a free API key for 1,000 requests/day.
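The same data can be fetched in Python. This is a minimal sketch assuming only what the curl example shows (a GET endpoint returning JSON); the response schema is not documented on this page, so the payload is simply printed:

# Fetch the repo's quality data from the public endpoint shown above.
import requests

url = "https://pt-edge.onrender.com/api/v1/quality/transformers/oripress/AlgoTune"
resp = requests.get(url, timeout=10)
resp.raise_for_status()  # fail loudly on HTTP errors (e.g. rate limiting)
print(resp.json())       # schema undocumented here; inspect interactively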
Related repositories
xjywhu/Awesome-Multimodal-LLM-for-Code
Multimodal Large Language Models for Code Generation under Multimodal Scenarios
juyongjiang/CodeUp
CodeUp: A Multilingual Code Generation Llama-X Model with Parameter-Efficient Instruction-Tuning
Gen-Verse/ReasonFlux
[NeurIPS 2025 Spotlight] LLM post-training suite — featuring ReasonFlux, ReasonFlux-PRM, and...
jie-jw-wu/human-eval-comm
HumanEvalComm: Evaluating Communication Skill of Code LLM and LLM Agent
amazon-science/llm-code-preference
Training and Benchmarking LLMs for Code Preference.