Diffusion RLHF Alignment Diffusion Models

Tools and methods for aligning diffusion models using reinforcement learning and human feedback, including preference optimization, reward modeling, and RLHF fine-tuning techniques. Does NOT include general diffusion model training, inference optimization, or non-RL-based fine-tuning methods like LoRA.

There are 53 diffusion RLHF alignment models tracked. Two score above 50 (the Established tier). The highest-rated is FlorianFuerrutter/genQC at 65/100, with 57 stars and 2,712 monthly downloads.
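The score-to-tier mapping is not spelled out on this page, but it can be approximated from the listing. The sketch below is an inference, not the site's published rule: the "above 50" cutoff for Established comes from the summary above, while the cutoff of 30 between Emerging and Experimental is assumed from the listed rows (30 is shown as Emerging, 29 as Experimental). The function name `tier_for` is hypothetical, and scores of 46 to 50 do not appear in the listing, so their tier here is a guess.

```python
def tier_for(score: int) -> str:
    """Map a 0-100 score to a tier label.

    Cutoffs are assumptions inferred from the listing, not a documented rule.
    """
    if score > 50:       # summary text: "above 50 (Established tier)"
        return "Established"
    if score >= 30:      # assumed: lowest Emerging row scores 30
        return "Emerging"
    return "Experimental"  # assumed: highest Experimental row scores 29


# Scores taken from the listing, with the tiers shown next to them there.
for score in (65, 45, 30, 29, 10):
    print(score, tier_for(score))
```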

Get all 53 projects as JSON (note that the example request sets limit=20):

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=diffusion&subcategory=diffusion-rlhf-alignment&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
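The same request can be made from Python using only the standard library. This is a minimal sketch: the endpoint and the query parameters (domain, subcategory, limit) are copied from the curl command above, while the helper names `build_url` and `fetch_projects` are hypothetical. The shape of the JSON response is not described on this page, so the decoded payload is returned as-is, and no API-key handling is shown because the page does not say how a key is passed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="diffusion",
              subcategory="diffusion-rlhf-alignment",
              limit=20):
    """Build the query URL used by the curl example."""
    query = urlencode({"domain": domain,
                       "subcategory": subcategory,
                       "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the JSON body.

    The payload structure is not documented here, so it is returned unchanged.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs the network request. Whether a larger `limit` returns all 53 entries in one call is not stated on this page, so inspect the response before relying on it.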

# | Model | Description | Score | Tier
1 | FlorianFuerrutter/genQC | Generative Quantum Circuits | 65 | Established
2 | horseee/DeepCache | [CVPR 2024] DeepCache: Accelerating Diffusion Models for Free | 56 | Established
3 | Gen-Verse/MMaDA | MMaDA - Open-Sourced Multimodal Large Diffusion Language Models (dLLMs with... | 45 | Emerging
4 | kuleshov-group/mdlm | [NeurIPS 2024] Simple and Effective Masked Diffusion Language Model | 42 | Emerging
5 | Shark-NLP/DiffuSeq | [ICLR'23] DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models | 40 | Emerging
6 | jeongwhanchoi/SCONE | "SCONE: A Novel Stochastic Sampling to Generate Contrastive Views and Hard... | 36 | Emerging
7 | ali-vilab/TeaCache | Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model | 35 | Emerging
8 | Hzfinfdu/Diffusion-BERT | ACL'2023: DiffusionBERT: Improving Generative Masked Language Models with... | 33 | Emerging
9 | Xiuyu-Li/q-diffusion | [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models. | 33 | Emerging
10 | yjyddq/EOSER-ASS-RL | Official Repository of "Taming Masked Diffusion Language Models via... | 33 | Emerging
11 | yk7333/d3po | [CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion... | 32 | Emerging
12 | czg1225/AsyncDiff | [NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising | 32 | Emerging
13 | keshik6/grafting | [NeurIPS 2025 Oral] Official Code for Exploring Diffusion Transformer... | 32 | Emerging
14 | masa-ue/SVDD | Derivative-Free Guidance in Diffusion Models with Soft Value-Based Decoding.... | 30 | Emerging
15 | HKUDS/DiffKG | [WSDM'2024 Oral] "DiffKG: Knowledge Graph Diffusion Model for Recommendation" | 30 | Emerging
16 | mapo-t2i/mapo | Official codebase for Margin-aware Preference Optimization for Aligning... | 30 | Emerging
17 | MiZhenxing/ThinkDiff | ICML2025, I Think, Therefore I Diffuse: Enabling Multimodal In-Context... | 30 | Emerging
18 | HKUDS/RecDiff | [CIKM'2024] "RecDiff: Diffusion Model for Social Recommendation" | 29 | Experimental
19 | InternScience/AdaptiveDiffusion | [NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference... | 28 | Experimental
20 | mihirp1998/AlignProp | AlignProp uses direct reward backpropagation for the alignment of... | 28 | Experimental
21 | Ting-Justin-Jiang/sada-icml | [ICML 2025] Official Repo for Stability-guided Adaptive Diffusion... | 28 | Experimental
22 | basiclab/DiffusionDRO | [NeurIPS 2025] Ranking-based Preference Optimization for Diffusion Models... | 28 | Experimental
23 | AniAggarwal/ecad | [ICLR 2026] Code for Evolutionary Caching to Accelerate Your Off-the-Shelf... | 27 | Experimental
24 | H-EmbodVis/EasyCache | Less is Enough: Training-Free Video Diffusion Acceleration via... | 27 | Experimental
25 | hu-zijing/B2-DiffuRL | [CVPR 25] A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning. | 26 | Experimental
26 | aaron-di/CDM-PSL | [AAAI 2025] CDM-PSL: Expensive Multi-Objective Bayesian Optimization Based... | 26 | Experimental
27 | ZiyiZhang27/sdpo | [IEEE TPAMI] Code for the paper "Aligning Few-Step Diffusion Models with... | 25 | Experimental
28 | zihaowu25/InvarDiff | InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models | 25 | Experimental
29 | AIDC-AI/Diffusion-SDPO | Diffusion-SDPO: Safeguarded Direct Preference Optimization for Diffusion Models | 25 | Experimental
30 | BIT-DA/DUSA | [NeurIPS 2024] Exploring Structured Semantic Priors Underlying Diffusion... | 23 | Experimental
31 | ilog-ecnu/CDM-PSL | [AAAI 2025] CDM-PSL: Expensive Multi-Objective Bayesian Optimization Based... | 23 | Experimental
32 | yjyddq/rho-EOS | Official Repository of "ρ-𝙴𝙾𝚂: Training-free Bidirectional Variable-Length... | 23 | Experimental
33 | hu-zijing/D-Fusion | [ICML 25] Denoising trajectory fusion, a method to construct RL-trainable... | 23 | Experimental
34 | ModelTC/HarmoniCa | [ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa:... | 22 | Experimental
35 | L-YeZhu/BoundaryDiffusion | [NeurIPS2023] BoundaryDiffusion: A learning-free method for semantic control... | 22 | Experimental
36 | user683/CausalDiffRec | [WWW'25] The official implementation of Graph Representation Learning via... | 22 | Experimental
37 | OptiSys-ZJU/segquant | [CVPR '26] A Semantics-Aware and Generalizable Quantization Framework for... | 21 | Experimental
38 | UCF-CRCV/core | Context-Robust Remasking for Diffusion Language Models | 21 | Experimental
39 | masa-ue/RLfinetuning_Diffusion_Bioseq | Code for the tutorial/review paper for RL-based-fine-tuning. In this code,... | 21 | Experimental
40 | itsluckysharma01/RL-based_Adaptive_Game_Difficulty_Engine | This repository contains an implementation of a 🏗️Reinforcement Learning... | 21 | Experimental
41 | akashsonowal/ddpo-pytorch | RLHF for Stable Diffusion | 20 | Experimental
42 | ZiyiZhang27/tdpo | [ICML 2024] Code for the paper "Confronting Reward Overoptimization for... | 19 | Experimental
43 | federicobrancasi/quantdiff-paper | Research paper: QuantDiff - Efficient Mixed-Precision Quantization for... | 19 | Experimental
44 | HKUDS/DiffMM | [ACM MM'2024] "DiffMM: Multi-Modal Diffusion Model for Recommendation" | 18 | Experimental
45 | suinleelab/An-Efficient-Framework-for-Crediting-Data-Contributors-of-Diffusion-Models | [ICLR2025] An Efficient Framework for Crediting Data Contributors of Diffusion Models | 17 | Experimental
46 | horseee/learning-to-cache | [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via... | 16 | Experimental
47 | zhaoyl18/SEIKO | SEIKO is a novel reinforcement learning method to efficiently fine-tune... | 16 | Experimental
48 | THU-AccDiff/xslim | Official implementation of X-Slim(xslim): Accelerating diffusion model via... | 15 | Experimental
49 | LemonTwoL/ReNeg | ReNeg: Learning Negative Embedding with Reward Guidance | 14 | Experimental
50 | sahsaeedi/DCPO-T2I | [TMLR] Dual Caption Preference Optimization | 14 | Experimental
51 | Yeez-lee/Data-Selection-and-Reweighting-for-Diffusion-Models | [ICASSP 25'] Pruning then Reweighting: Towards Data-Efficient Training of... | 13 | Experimental
52 | LIUTIGHE/HetCache | [CVPR'26] The official implementation of paper "Accelerating Diffusion-based... | 11 | Experimental
53 | arthur-x/AlmostPerfect | Simple end-to-end RLHF (Reinforcement Learning from Human Feedback) for... | 10 | Experimental