Code Model Training AI Coding Tools

Tools and frameworks for pre-training, fine-tuning, and optimizing language models specifically for code generation and programming tasks. Does NOT include inference-only tools, deployment platforms, or general LLM training frameworks.

There are 76 code model training tools tracked. One scores above 50 (Established tier). The highest-rated is k4black/codebleu at 58/100, with 130 stars and 5,089 monthly downloads.

Get all 76 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ai-coding&subcategory=code-model-training&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
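The same request can be made from a script. The following is a minimal Python sketch, not an official client: the endpoint and the `domain`, `subcategory`, and `limit` query parameters are taken from the curl command on this page, while the helper names `build_url` and `fetch_projects` are hypothetical. The curl command shown uses `limit=20`; whether the API accepts a larger `limit` (for example 76) is an assumption to verify, and the shape of the returned JSON is not documented here, so the sketch returns it unparsed beyond JSON decoding.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the request URL with the query parameters shown in the curl example."""
    params = {
        "domain": "ai-coding",
        "subcategory": "code-model-training",
        "limit": limit,  # assumption: values above 20 may or may not be accepted
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON response (hypothetical helper; response schema unknown)."""
    with urlopen(build_url(limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects()` performs one unauthenticated request, which counts against the 100 requests/day allowance.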

# | Tool | Description | Score | Tier
1 | k4black/codebleu | Pip compatible CodeBLEU metric implementation available for linux/macos/win | 58 | Established
2 | LiveCodeBench/LiveCodeBench | Official repository for the paper "LiveCodeBench: Holistic and Contamination... | 46 | Emerging
3 | EdinburghNLP/code-docstring-corpus | Preprocessed Python functions and docstrings for automated code... | 41 | Emerging
4 | AS-SiliconMind/SiliconMind-V1 | Inference Engine for SiliconMind-V1 Verilog Coding Models | 41 | Emerging
5 | hendrycks/apps | APPS: Automated Programming Progress Standard (NeurIPS 2021) | 39 | Emerging
6 | solis-team/Hydra | [FSE 2026] Do Not Treat Code as Natural Language: Implications for... | 38 | Emerging
7 | alxschwrz/codex_py2cpp | Converts Python code into C++ using OpenAI Codex. | 36 | Emerging
8 | reddy-lab-code-research/PPOCoder | Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation... | 33 | Emerging
9 | tongye98/Awesome-Code-Benchmark | A comprehensive code domain benchmark review of LLM research. | 33 | Emerging
10 | bharathsudharsan/OTA-TinyML | Code for IEEE Internet Computing Journal paper 'OTA-TinyML: Over the Air... | 32 | Emerging
11 | logpai/LogBench | A benchmark for logging statement generation. | 31 | Emerging
12 | s2e-lab/Code-Smell-Code-Generation | Source code for "An Empirical Study of Code Smells in Transformer-based Code... | 30 | Emerging
13 | JHansiduYapa/Fine-Tuning-a-Small-Language-Model-for-Cypher-Query-Generation | This project fine-tunes Unsloth's Gemma-3 4B IT (4-bit) model to translate... | 30 | Emerging
14 | zorazrw/odex | [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation | 29 | Experimental
15 | vl2g/floco | Flow Chart Image-to-Code Generation | 28 | Experimental
16 | code-gen/cscg | Code Generation as a Dual Task of Code Summarization. | 28 | Experimental
17 | 99EnriqueD/verilog_autocompletion | Code implementation for "A Deep Learning Framework for Verilog... | 28 | Experimental
18 | CloudIDEaaS-zz/hydra | Hydra is an app generation product. Hydra aims to reduce the "concept to... | 28 | Experimental
19 | devashish-gupta/Geode | A zero-shot geospatial question answering agent with precise spatiotemporal... | 27 | Experimental
20 | Gen-Verse/CURE | [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via... | 27 | Experimental
21 | s2e-lab/SecurityEval | Repository for "SecurityEval Dataset: Mining Vulnerability Examples to... | 27 | Experimental
22 | matlab-deep-learning/Deep_Learning_Poker_Player_using_MATLAB_and_Raspberry_Pi | This example shows how to use automatic code generation to deploy a deep... | 26 | Experimental
23 | martin-wey/cl-code-apis | Replication package of the paper "On the Usage of Continual Learning for... | 26 | Experimental
24 | formula-code/terminal-bench | Evaluation harness for FormulaCode | 25 | Experimental
25 | madaan/pie-perf | Training language models to make programs faster | 25 | Experimental
26 | formula-code/fc-eval | Evaluation harness for FormulaCode | 25 | Experimental
27 | WebPAI/Interaction2Code | [ASE 2025] Benchmarking MLLM-based Interactive Webpage Code Generation from... | 24 | Experimental
28 | Pavansomisetty21/Automated-Code-Generation-and-Execution-Agent-using-LangChain-and-Cohere-LLM | In this we implement an agent which generates and executes code using Cohere... | 23 | Experimental
29 | adpena/vertigo-lora | Domain-specialized LoRA fine-tuning pipeline for Roblox/Luau code generation... | 23 | Experimental
30 | skpig/MPSC | [ACL 2024] Enhancing Large Language Models in Coding Through... | 23 | Experimental
31 | matthewdeanmartin/paipi | PyPI search, except the backend is an LLM's pixelated memory of PyPI. | 23 | Experimental
32 | yunbow/ai-dev-os-benchmark | Benchmark: how AI coding guidelines affect code quality, 3 conditions × 9... | 23 | Experimental
33 | HIT-SCIR/Abacus | Abacus Code LLM (Zhusuan code large language model) | 22 | Experimental
34 | HySonLab/Design2Code | Large Language Model in combination with Large Vision Model for the task of... | 22 | Experimental
35 | kroq86/honeybadger | formal VM benchmark and inspectable reasoning runtime for testing whether... | 22 | Experimental
36 | carlos-life/OpenEvolve | Evolve algorithms with LLMs. Open-source AlphaEvolve alternative. Uses... | 22 | Experimental
37 | sephirxth/LLM_code_test | LLM code generation benchmark: Claude vs Gemini vs DeepSeek vs Grok on a... | 22 | Experimental
38 | Meisdy/Speech-to-Code-Generation-for-Collaborative-Robots | A modular pipeline that lets users program collaborative robots through... | 22 | Experimental
39 | Rudra5417/Code-Generator-using-GPT-3 | Natural Language to Code | 22 | Experimental
40 | Training-Datasmith/olmo3-code-150m-pretrain | Pre-training a ~150M parameter code-specialized language model using OLMo 3... | 22 | Experimental
41 | aswathselvam/Potholes | Realtime pothole detection on Android phone's IMU data. SVM model in C++, ... | 22 | Experimental
42 | sanskar9999/CodeEvolveLLM | A framework for using local LLMs (Qwen2.5-coder 7B) that are fine-tuned... | 21 | Experimental
43 | aixcoder-plugin/nl2code-dataset | Aix-bench, the Java benchmark for code synthesis problem. | 20 | Experimental
44 | domaineval/DomainEval | DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation... | 20 | Experimental
45 | KohlerHECTOR/interpreter-py | Implementation of Interpretable and Editable Programmatic Tree Policies for... | 20 | Experimental
46 | jszheng21/RACE | RACE is a multi-dimensional benchmark for code generation that focuses on... | 20 | Experimental
47 | VaibhavYadav/pytorch_pix2code | A PyTorch implementation of pix2code | 20 | Experimental
48 | albertusk95/intention-to-code-lstm | Source Code Generation Based On User Intention Using LSTM Networks | 19 | Experimental
49 | seal-research/OmniCode | OmniCode: A Diverse Software Engineering Benchmark for Evaluating Large... | 19 | Experimental
50 | medxiaorudan/CodeGeneration | Prompt engineering with Langchain and fine-tuning the CodeLlama model. The... | 18 | Experimental
51 | CodeEff/ECCO | [EMNLP 2024] Code for the paper "ECCO: Can We Improve Model-Generated Code... | 18 | Experimental
52 | LiuZeJie97/Code-Generation-From-Flowcharts-with-Texts-A-Benchmark-Dataset-and-An-Approach | Code for the paper "Code Generation From Flowcharts with Texts: A Benchmark... | 17 | Experimental
53 | ftrou/Decodifier | The Compiler for AI-Generated Software. LLMs don’t write code. ... | 16 | Experimental
54 | AngelicaArabe/OTA-IOT | 🔧 Develop IoT applications with ESP32-S3 using OTA updates, SPIFFS web... | 16 | Experimental
55 | ameerkhan9394/ide-ai-benchmark | 🚀 Evaluate and compare AI models across multiple IDEs with a comprehensive... | 15 | Experimental
56 | LIANGQINGYUAN/Lyra | Lyra: A Benchmark for Turducken-Style Code Generation | 15 | Experimental
57 | PAN001/LeToRr | LeToRr: Learning to Re-rank with Application in Code Generation | 14 | Experimental
58 | cloudrishi/springboot-ai-generator | AI-powered Spring Boot code generator using CodeLlama LLM running locally via Ollama | 14 | Experimental
59 | dakshjain-1616/nemotron3-super-vs-gpt5.4-nano | Head-to-head benchmark comparing Nemotron and GPT-5.4-nano on code generation tasks | 14 | Experimental
60 | ALM3ARQ/character-prefix-conditioning | 🔍 Streamline token sampling with character prefix conditioning using a... | 14 | Experimental
61 | ada994/prism-bench | 🌐 Benchmark models using the PRISM framework and access the FLUX-Reason-6M... | 14 | Experimental
62 | jacopotagliabue/LLMs-to-Alloy | Example of LLM generated Alloy code for deductive reasoning from English... | 14 | Experimental
63 | yueyueL/ReliableLM4Code | Collections of research, benchmarks and tools towards more robust and... | 14 | Experimental
64 | przeprogramowani/10x-bench-eval | Scoring criteria for 10x-bench (10xbench.ai) | 13 | Experimental
65 | sssszh/CodePLAN | The code repository for the paper “Enhancing Code Generation Performance of... | 13 | Experimental
66 | kabirjaipal/Evil-Codes | Evil Codes is a repository where you will find many useful code snippets and... | 13 | Experimental
67 | Bifrost-Technologies/Prometheus | A developer platform for generating complete Solana programs in one-shot... | 13 | Experimental
68 | betterenvi/open-dataset | Links to awesome open dataset. | 12 | Experimental
69 | evalops/llmcc | LLM-native compiler toolchain - implementing 'LLM ≈ probabilistic compiler'... | 12 | Experimental
70 | AshrafMorningstar/omni-code-polyglot | A massive, SEO-optimized collection of 300+ ready-to-run code snippets in... | 12 | Experimental
71 | rajat-kumar-thakur/LLMs-for-Resource-Constrained-Devices | This work was done as part of SRIP 2025 Internship, IIT Gandhinagar | 11 | Experimental
72 | navneetprabhakar/telegram-bot-llm | Telegram bot with LLM code gen capabilities | 11 | Experimental
73 | runaicode/ai-coding-benchmarks | Standardized test prompts and benchmarks for evaluating AI coding... | 11 | Experimental
74 | gokhanercan/gen-atomic | An LLM-based code generation framework aims to support a wide range of... | 11 | Experimental
75 | falconvn2006/GPasT | GPT for Pascal code generation :) | 11 | Experimental
76 | motazsaad/Natural-Language-to-Python | Natural Language to Python code Translation | 10 | Experimental
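
The tier labels in the listing above appear to follow score bands. The sketch below encodes bands inferred only from the scores shown on this page: the page states that scores above 50 are Established, the lowest Emerging score listed is 30, and the highest Experimental score listed is 29. The exact cut-offs (including how a score of exactly 50 is treated) are assumptions, not a documented scoring rule, and `tier_for` is a hypothetical helper name.

```python
def tier_for(score):
    """Map a 0-100 quality score to a tier label using bands inferred from this page."""
    if score >= 50:  # assumption: the page only says "above 50"; 50 itself is a guess
        return "Established"
    if score >= 30:  # inferred: 30 is the lowest score listed as Emerging
        return "Emerging"
    return "Experimental"  # inferred: 29 is the highest score listed as Experimental


# Spot checks against rows in the listing.
for name, score in [("k4black/codebleu", 58), ("logpai/LogBench", 31), ("zorazrw/odex", 29)]:
    print(name, score, tier_for(score))
```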