Code Model Training AI Coding Tools
Tools and frameworks for pre-training, fine-tuning, and optimizing language models specifically for code generation and programming tasks. Does NOT include inference-only tools, deployment platforms, or general LLM training frameworks.
76 code model training tools are tracked. One scores above 50 (the Established tier): k4black/codebleu, the highest-rated, at 58/100 with 130 stars and 5,089 monthly downloads.
Get all 76 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ai-coding&subcategory=code-model-training&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
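For scripted use, the same request can be issued from Python's standard library. The sketch below is illustrative only: the base URL and query parameters are copied from the curl command above, while the helper names `build_url` and `fetch` are hypothetical. The response schema is not described on this page, so the code decodes the JSON without assuming any field names, and whether the endpoint accepts a `limit` larger than 20 is an untested assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base endpoint copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the query URL with the same parameters as the curl command."""
    params = {
        "domain": "ai-coding",
        "subcategory": "code-model-training",
        # 20 matches the curl example; larger values are an untested assumption.
        "limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch(url, timeout=30):
    """Request the URL and decode the JSON body without assuming its schema."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch(build_url())` performs one request, which counts against the daily quota mentioned above.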
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | **k4black/codebleu** Pip compatible CodeBLEU metric implementation available for linux/macos/win | 58 | Established |
| 2 | **LiveCodeBench/LiveCodeBench** Official repository for the paper "LiveCodeBench: Holistic and Contamination... | | Emerging |
| 3 | **EdinburghNLP/code-docstring-corpus** Preprocessed Python functions and docstrings for automated code... | | Emerging |
| 4 | **AS-SiliconMind/SiliconMind-V1** Inference Engine for SiliconMind-V1 Verilog Coding Models | | Emerging |
| 5 | **hendrycks/apps** APPS: Automated Programming Progress Standard (NeurIPS 2021) | | Emerging |
| 6 | **solis-team/Hydra** [FSE 2026] Do Not Treat Code as Natural Language: Implications for... | | Emerging |
| 7 | **alxschwrz/codex_py2cpp** Converts Python code into C++ using OpenAI Codex. | | Emerging |
| 8 | **reddy-lab-code-research/PPOCoder** Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation... | | Emerging |
| 9 | **tongye98/Awesome-Code-Benchmark** A comprehensive code domain benchmark review of LLM researches. | | Emerging |
| 10 | **bharathsudharsan/OTA-TinyML** Code for IEEE Internet Computing Journal paper 'OTA-TinyML: Over the Air... | | Emerging |
| 11 | **logpai/LogBench** A benchmark for logging statement generation. | | Emerging |
| 12 | **s2e-lab/Code-Smell-Code-Generation** Source code for "An Empirical Study of Code Smells in Transformer-based Code... | | Emerging |
| 13 | **JHansiduYapa/Fine-Tuning-a-Small-Language-Model-for-Cypher-Query-Generation** This project fine-tunes Unsloth's Gemma-3 4B IT (4-bit) model to translate... | | Emerging |
| 14 | **zorazrw/odex** [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation | | Experimental |
| 15 | **vl2g/floco** Flow Chart Image-to-Code Generation | | Experimental |
| 16 | **code-gen/cscg** Code Generation as a Dual Task of Code Summarization. | | Experimental |
| 17 | **99EnriqueD/verilog_autocompletion** Code implementation for "A Deep Learning Framework for Verilog... | | Experimental |
| 18 | **CloudIDEaaS-zz/hydra** Hydra is an app generation product. Hydra aims to reduce the "concept to... | | Experimental |
| 19 | **devashish-gupta/Geode** A zero-shot geospatial question answering agent with precise spatiotemporal... | | Experimental |
| 20 | **Gen-Verse/CURE** [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via... | | Experimental |
| 21 | **s2e-lab/SecurityEval** Repository for "SecurityEval Dataset: Mining Vulnerability Examples to... | | Experimental |
| 22 | **matlab-deep-learning/Deep_Learning_Poker_Player_using_MATLAB_and_Raspberry_Pi** This example shows how to use automatic code generation to deploy a deep... | | Experimental |
| 23 | **martin-wey/cl-code-apis** Replication package of the paper "On the Usage of Continual Learning for... | | Experimental |
| 24 | **formula-code/terminal-bench** Evaluation harness for FormulaCode | | Experimental |
| 25 | **madaan/pie-perf** Training language models to make programs faster | | Experimental |
| 26 | **formula-code/fc-eval** Evaluation harness for FormulaCode | | Experimental |
| 27 | **WebPAI/Interaction2Code** [ASE 2025] Benchmarking MLLM-based Interactive Webpage Code Generation from... | | Experimental |
| 28 | **Pavansomisetty21/Automated-Code-Generation-and-Execution-Agent-using-LangChain-and-Cohere-LLM** In this we implement an agent which generates and executes code using cohere... | | Experimental |
| 29 | **adpena/vertigo-lora** Domain-specialized LoRA fine-tuning pipeline for Roblox/Luau code generation... | | Experimental |
| 30 | **skpig/MPSC** [ACL 2024] Enhancing Large Language Models in Coding Through... | | Experimental |
| 31 | **matthewdeanmartin/paipi** Pypi search, except the backend is an LLM's pixelated memory of Pypi. | | Experimental |
| 32 | **yunbow/ai-dev-os-benchmark** Benchmark: how AI coding guidelines affect code quality — 3 conditions × 9... | | Experimental |
| 33 | **HIT-SCIR/Abacus** Abacus Code LLM (珠算代码大模型) | | Experimental |
| 34 | **HySonLab/Design2Code** Large Language Model in combination with Large Vision Model for the task of... | | Experimental |
| 35 | **kroq86/honeybadger** formal VM benchmark and inspectable reasoning runtime for testing whether... | | Experimental |
| 36 | **carlos-life/OpenEvolve** Evolve algorithms with LLMs. Open-source AlphaEvolve alternative. Uses... | | Experimental |
| 37 | **sephirxth/LLM_code_test** LLM code generation benchmark — Claude vs Gemini vs DeepSeek vs Grok on a... | | Experimental |
| 38 | **Meisdy/Speech-to-Code-Generation-for-Collaborative-Robots** A modular pipeline that lets users program collaborative robots through... | | Experimental |
| 39 | **Rudra5417/Code-Generator-using-GPT-3** Natural Language to Code | | Experimental |
| 40 | **Training-Datasmith/olmo3-code-150m-pretrain** Pre-training a ~150M parameter code-specialized language model using OLMo 3... | | Experimental |
| 41 | **aswathselvam/Potholes** Realtime pothole detection on Android phone's IMU data. SVM model in C++, ... | | Experimental |
| 42 | **sanskar9999/CodeEvolveLLM** A framework for using local LLMs (Qwen2.5-coder 7B) that are fine-tuned... | | Experimental |
| 43 | **aixcoder-plugin/nl2code-dataset** Aix-bench, the Java benchmark for code synthesis problem. | | Experimental |
| 44 | **domaineval/DomainEval** DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation... | | Experimental |
| 45 | **KohlerHECTOR/interpreter-py** Implementation of Interpretable and Editable Programmatic Tree Policies for... | | Experimental |
| 46 | **jszheng21/RACE** RACE is a multi-dimensional benchmark for code generation that focuses on... | | Experimental |
| 47 | **VaibhavYadav/pytorch_pix2code** A PyTorch implementation of pix2code | | Experimental |
| 48 | **albertusk95/intention-to-code-lstm** Source Code Generation Based On User Intention Using LSTM Networks | | Experimental |
| 49 | **seal-research/OmniCode** OmniCode: A Diverse Software Engineering Benchmark for Evaluating Large... | | Experimental |
| 50 | **medxiaorudan/CodeGeneration** Prompt engineering with Langchain and fine-tuning the CodeLlama model. The... | | Experimental |
| 51 | **CodeEff/ECCO** [EMNLP 2024] Code for the paper "ECCO: Can We Improve Model-Generated Code... | | Experimental |
| 52 | **LiuZeJie97/Code-Generation-From-Flowcharts-with-Texts-A-Benchmark-Dataset-and-An-Approach** Code for the paper "Code Generation From Flowcharts with Texts: A Benchmark... | | Experimental |
| 53 | **ftrou/Decodifier** The Compiler for AI-Generated Software. LLMs don’t write code. ... | | Experimental |
| 54 | **AngelicaArabe/OTA-IOT** 🔧 Develop IoT applications with ESP32-S3 using OTA updates, SPIFFS web... | | Experimental |
| 55 | **ameerkhan9394/ide-ai-benchmark** 🚀 Evaluate and compare AI models across multiple IDEs with a comprehensive... | | Experimental |
| 56 | **LIANGQINGYUAN/Lyra** Lyra: A Benchmark for Turducken-Style Code Generation | | Experimental |
| 57 | **PAN001/LeToRr** LeToRr: Learning to Re-rank with Application in Code Generation | | Experimental |
| 58 | **cloudrishi/springboot-ai-generator** AI-powered Spring Boot code generator using CodeLlama LLM running locally via Ollama | | Experimental |
| 59 | **dakshjain-1616/nemotron3-super-vs-gpt5.4-nano** Head-to-head benchmark comparing Nemotron and GPT-5.4-nano on code generation tasks | | Experimental |
| 60 | **ALM3ARQ/character-prefix-conditioning** 🔍 Streamline token sampling with character prefix conditioning using a... | | Experimental |
| 61 | **ada994/prism-bench** 🌐 Benchmark models using the PRISM framework and access the FLUX-Reason-6M... | | Experimental |
| 62 | **jacopotagliabue/LLMs-to-Alloy** Example of LLM generated Alloy code for deductive reasoning from English... | | Experimental |
| 63 | **yueyueL/ReliableLM4Code** Collections of research, benchmarks and tools towards more robust and... | | Experimental |
| 64 | **przeprogramowani/10x-bench-eval** Scoring criteria for 10x-bench (10xbench.ai) | | Experimental |
| 65 | **sssszh/CodePLAN** The code repository for the paper “Enhancing Code Generation Performance of... | | Experimental |
| 66 | **kabirjaipal/Evil-Codes** Evil Codes is a repository where you will find many useful code snippets and... | | Experimental |
| 67 | **Bifrost-Technologies/Prometheus** A developer platform for generating complete Solana programs in one-shot... | | Experimental |
| 68 | **betterenvi/open-dataset** Links to awesome open dataset. | | Experimental |
| 69 | **evalops/llmcc** LLM-native compiler toolchain - implementing 'LLM ≈ probabilistic compiler'... | | Experimental |
| 70 | **AshrafMorningstar/omni-code-polyglot** A massive, SEO‑optimized collection of 300+ ready‑to‑run code snippets in... | | Experimental |
| 71 | **rajat-kumar-thakur/LLMs-for-Resource-Constrained-Devices** This work was done as part of SRIP 2025 Internship, IIT Gandhinagar | | Experimental |
| 72 | **navneetprabhakar/telegram-bot-llm** Telegram bot with LLM code gen capabilities | | Experimental |
| 73 | **runaicode/ai-coding-benchmarks** Standardized test prompts and benchmarks for evaluating AI coding... | | Experimental |
| 74 | **gokhanercan/gen-atomic** An LLM-based code generation framework aims to support a wide range of... | | Experimental |
| 75 | **falconvn2006/GPasT** GPT for Pascal code generation :) | | Experimental |
| 76 | **motazsaad/Natural-Language-to-Python** Natural Language to Python code Translation | | Experimental |