Compositional T2I Generation Diffusion Models
Tools for enhancing spatial reasoning, multi-concept composition, and fine-grained control in text-to-image diffusion models through architectural improvements and guidance techniques. Does NOT include general T2I generation, LoRA training, or personalization fine-tuning methods.
This category tracks 133 compositional T2I generation projects. Three score above 50 (Established tier). The highest-rated is PaddlePaddle/PaddleMIX at 60/100 with 718 stars. Two of the top 10 are actively maintained.
Get all 133 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=diffusion&subcategory=compositional-t2i-generation&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
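The same request can be issued from a script. The sketch below is a minimal example using only the Python standard library; it reuses the endpoint and the `domain`, `subcategory`, and `limit` query parameters exactly as they appear in the curl command. The function names `build_quality_url` and `fetch_quality` are hypothetical, and the response schema is not documented here, so the code only decodes the JSON and makes no assumption about its fields. Whether the endpoint accepts a `limit` above 20 or requires pagination to return all 133 projects is also an assumption to verify.

```python
import json
import urllib.parse
import urllib.request

# Endpoint copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain="diffusion",
                      subcategory="compositional-t2i-generation",
                      limit=20):
    """Build the query URL (hypothetical helper; parameter names come from the curl example)."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """Send a GET request and decode the JSON body without assuming its schema."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Unauthenticated access is limited to 100 requests/day, so call sparingly.
    data = fetch_quality(build_quality_url())
    print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to change `limit` or the subcategory and to inspect the URL before spending one of the daily requests.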
| # | Model | Score | Tier |
|---|---|---|---|
| 1 | PaddlePaddle/PaddleMIX<br>Paddle Multimodal Integration and eXploration, supporting mainstream... | | Established |
| 2 | UCSC-VLAA/story-iter<br>[ICLR 2026] A Training-free Iterative Framework for Long Story Visualization | | Established |
| 3 | keivalya/mini-vla<br>a minimal, beginner-friendly VLA to show how robot policies can fuse images,... | | Established |
| 4 | adobe-research/custom-diffusion<br>Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023) | | Emerging |
| 5 | byliutao/1Prompt1Story<br>🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent... | | Emerging |
| 6 | HorizonWind2004/reconstruction-alignment<br>[ICLR 2026] Official repo of paper "Reconstruction Alignment Improves... | | Emerging |
| 7 | mit-han-lab/lpd<br>[ICLR 2026 Oral] Locality-aware Parallel Decoding for Efficient... | | Emerging |
| 8 | zai-org/ImageReward<br>[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for... | | Emerging |
| 9 | OpenDriveLab/Nexus<br>[ICCV 2025] Nexus: Decoupled Diffusion Sparks Adaptive Scene Generation | | Emerging |
| 10 | JyChen9811/FaithDiff<br>[CVPR 2025] FaithDiff for Classic Film Rejuvenation, Old Photo Revival,... | | Emerging |
| 11 | foivospar/Arc2Face<br>[ECCV 2024 Oral 🔥] Arc2Face: A Foundation Model for ID-Consistent Human... | | Emerging |
| 12 | ziqihuangg/Collaborative-Diffusion<br>[CVPR 2023] Collaborative Diffusion | | Emerging |
| 13 | haoyangzheng-ai/didi-instruct<br>[ICLR 2026] Discrete Diffusion Divergence Instruct (DiDi-Instruct) | | Emerging |
| 14 | H-EmbodVis/MERGE<br>[NeurIPS 2025] More Than Generation: Unifying Generation and Depth... | | Emerging |
| 15 | lmxyy/sige<br>[NeurIPS 2022, T-PAMI 2023] Efficient Spatially Sparse Inference for... | | Emerging |
| 16 | grigorisg9gr/polynomial_nets<br>Official Implementation of the CVPR'20 paper 'Π-nets: Deep Polynomial Neural... | | Emerging |
| 17 | yandex-research/swd<br>[ICLR'2026] Scale-wise Distillation of Diffusion Models | | Emerging |
| 18 | YixunLiang/UniTEX<br>Official implementation of "UniTEX: Universal High Fidelity Generative... | | Emerging |
| 19 | ankanbhunia/PIDM<br>Person Image Synthesis via Denoising Diffusion Model (CVPR 2023) | | Emerging |
| 20 | bytedance/UNO<br>[ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and... | | Emerging |
| 21 | energy-based-model/Compositional-Visual-Generation-with-Composable-Diffusion-Models-PyTorch<br>[ECCV 2022] Compositional Generation using Diffusion Models | | Emerging |
| 22 | yuval-alaluf/Attend-and-Excite<br>Official Implementation for "Attend-and-Excite: Attention-Based Semantic... | | Emerging |
| 23 | junkunyuan/NexusAlign<br>A unified and extensible framework for aligning foundation models. | | Emerging |
| 24 | gudaochangsheng/RefAlign<br>Official PyTorch implementation of RefAlign: Representation Alignment for... | | Emerging |
| 25 | open-mmlab/PIA<br>[CVPR 2024] PIA, your Personalized Image Animator. Animate your images by... | | Emerging |
| 26 | sihyun-yu/REPA<br>[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion... | | Emerging |
| 27 | WindVChen/Diff-Harmonization<br>A novel zero-shot image harmonization method based on Diffusion Model Prior. | | Emerging |
| 28 | youngwanLEE/sdxl-koala<br>[NeurIPS 2024] Empirical Lessons Toward Memory-Efficient and Fast Diffusion... | | Emerging |
| 29 | AlaaLab/InstructCV<br>[ ICLR 2024 ] Official Codebase for "InstructCV: Instruction-Tuned... | | Emerging |
| 30 | ExplainableML/ReNO<br>[NeurIPS 2024] ReNO: Enhancing One-step Text-to-Image Models through... | | Emerging |
| 31 | limuloo/MIGC<br>[CVPR 2024 Highlight] MIGC and [TPAMI 2024] MIGC++ (Official Implementation) | | Emerging |
| 32 | nupurkmr9/concept-ablation<br>Ablating Concepts in Text-to-Image Diffusion Models (ICCV 2023) | | Emerging |
| 33 | gojasper/flash-diffusion<br>⚡ Flash Diffusion ⚡: Accelerating Any Conditional Diffusion Model for Few... | | Emerging |
| 34 | Ammmob/PixelSmile<br>PixelSmile: Fine-grained facial expression editing with continuous control,... | | Emerging |
| 35 | HVision-NKU/ImageCritic<br>Official implementation of ImageCritic (CVPR 2026) | | Emerging |
| 36 | M-E-AGI-Lab/PSAlign<br>Official Implementation of "PSAlign: Personalized Safety Alignment for... | | Emerging |
| 37 | lzyhha/VisualCloze<br>[ICCV 2025] VisualCloze: A universal image generation framework that can... | | Emerging |
| 38 | CVL-UESTC/Internal-Guidance<br>CVPR 2026-Guiding a Diffusion Transformer with the Internal Dynamics of Itself (IG) | | Emerging |
| 39 | HKUST-LongGroup/Coarse-guided-Gen<br>[arXiv 2026] Official PyTorch Repository for "Coarse-Guided Visual... | | Emerging |
| 40 | baojudezeze/RMP-Adapter<br>The implementation of RMP-Adapter: A region-based Multiple Prompt Adapter... | | Emerging |
| 41 | NeuralTextualInversion/NeTI<br>Official Implementation for "A Neural Space-Time Representation for... | | Emerging |
| 42 | blurgyy/CoMPaSS<br>[ICCV 2025] Enhancing spatial understanding in text-to-Image diffusion models | | Emerging |
| 43 | zhiyichin/P4D<br>[ICML 2024] Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models... | | Emerging |
| 44 | kfirgoldberg/ConceptLab<br>Official Implementation for "ConceptLab: Creative Generation using Diffusion... | | Emerging |
| 45 | RockeyCoss/SPO<br>[CVPR 2025] Aesthetic Post-Training Diffusion Models from Generic... | | Emerging |
| 46 | muzishen/IMAGPose<br>[NeurIPS 2024] 🕺IMAGPose🕺: A Unified Conditional Framework for Pose-Guided... | | Emerging |
| 47 | ashutosh1919/mdp-diffusion<br>Text-guided image editing by manipulating diffusion path without any training. | | Emerging |
| 48 | huanngzh/Parts2Whole<br>[TIP 2025] From Parts to Whole: A Unified Reference Framework for... | | Emerging |
| 49 | aminK8/KnobGen<br>CVPR 2025 Workshop on CVEU. | | Emerging |
| 50 | VinAIResearch/DiMSUM<br>DiMSUM: Diffusion Mamba - A Scalable and Unified Spatial-Frequency Method... | | Emerging |
| 51 | sled-group/CycleNet<br>[NeurIPS 2023] Official Code for CycleNet: Rethinking Cycle Consistent in... | | Emerging |
| 52 | universome/alis<br>[ICCV 2021] Aligning Latent and Image Spaces to Connect the Unconnectable | | Experimental |
| 53 | LiyaoJiang1998/RAISE<br>"RAISE: Requirement-Adaptive Evolutionary Refinement for Training-Free... | | Experimental |
| 54 | AIDC-AI/TeEFusion<br>TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance (ICCV 2025) | | Experimental |
| 55 | YangLing0818/IterComp<br>[ICLR 2025] IterComp: Iterative Composition-Aware Feedback Learning from... | | Experimental |
| 56 | JIA-Lab-research/RIVAL<br>[NeurIPS 2023 Spotlight] Real-World Image Variation by Aligning Diffusion... | | Experimental |
| 57 | xie-lab-ml/CoRe2<br>[TPAMI] The official implementation of our paper "CoRe^2: Collect, Reflect... | | Experimental |
| 58 | customdiffusion360/custom-diffusion360<br>CustomDiffusion360: Customizing Text-to-Image Diffusion with Camera Viewpoint Control | | Experimental |
| 59 | boschresearch/Divide-and-Bind<br>Official implementation of "Divide & Bind Your Attention for Improved... | | Experimental |
| 60 | tgxs002/align_sd<br>Better Aligning Text-to-Image Models with Human Preference. ICCV 2023 | | Experimental |
| 61 | yuxin-jiang/Anomagic<br>[AAAI 2026] The Official Implementation for "Anomagic: Crossmodal... | | Experimental |
| 62 | ChenDarYen/Key-Locked-Rank-One-Editing-for-Text-to-Image-Personalization<br>An Pytorch implementation of the paper Key-Locked Rank One Editing for... | | Experimental |
| 63 | bytedance-fanqie-ai/MOSAIC<br>[ICLR 2026]🔥🔥🔥MOSAIC: Multi-Subject Personalized Generation via... | | Experimental |
| 64 | Nikolai10/PerCo<br>PyTorch implementation of PerCo (Towards Image Compression with Perfect... | | Experimental |
| 65 | ChenWu98/generative-visual-prompt<br>[NeurIPS 2022] (Amortized) distributional control for pre-trained generative models | | Experimental |
| 66 | VAST-AI-Research/SeqTex<br>[SIGGRAPH Asia 2025] Official github repo of SeqTex, an end-to-end 3D... | | Experimental |
| 67 | guillaumejs2403/TIME<br>Text-to-Image Models for Counterfactual Explanations: a black-box approach... | | Experimental |
| 68 | mapooon/Face2Diffusion<br>[CVPR 2024] Face2Diffusion for Fast and Editable Face Personalization... | | Experimental |
| 69 | joanrod/figure-diffusion<br>Generating figures from research papers, using textual captions from the paper. | | Experimental |
| 70 | TsingZ0/FedKTL<br>CVPR 2024 accepted paper, An Upload-Efficient Scheme for Transferring... | | Experimental |
| 71 | kongzhecn/OMG<br>[ECCV 2024] OMG: Occlusion-friendly Personalized Multi-concept Generation In... | | Experimental |
| 72 | ChenWu98/unified-generative-zoo<br>[ICCV 2023] https://arxiv.org/abs/2210.05559 | | Experimental |
| 73 | quickgrid/text-to-image-diffusion<br>Experimental (working!) custom implementation of conditional and... | | Experimental |
| 74 | ewrfcas/LeftRefill<br>LeftRefill: Filling Right Canvas based on Left Reference through Generalized... | | Experimental |
| 75 | hu-zijing/AsynDM<br>[ICLR 26] Asynchronous diffusion models allocate individual pixels with... | | Experimental |
| 76 | opendilab/PRG<br>[ICCV 2025] Pretrained Reversible Generation as Unsupervised Visual... | | Experimental |
| 77 | thecrazymage/CasTex<br>[WACV 2026] CasTex: Cascaded Text-to-Texture Synthesis via Explicit Texture... | | Experimental |
| 78 | IBM/DiffuseKronA<br>DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized... | | Experimental |
| 79 | SPRIGHT-T2I/SPRIGHT<br>[ECCV 2024] Official PyTorch implementation of "Getting it Right: Improving... | | Experimental |
| 80 | mofayezi/RobuText<br>[CVPRW 2023] Official implementation of "Benchmarking Robustness to... | | Experimental |
| 81 | Raghuram-Veeramallu/DiffTransBEV<br>BEV Representation of an Autonomous car using 6 RGB cameras by making use of... | | Experimental |
| 82 | zelaki/ReDi<br>[NeurIPS'25 Spotlight] Boosting Generative Image Modeling via Joint... | | Experimental |
| 83 | Nithin-GK/UniteandConquer<br>[CVPR '23] Unite and Conquer: Plug & Play Multi-Modal Synthesis using... | | Experimental |
| 84 | haoningwu3639/MegaFusion<br>[WACV 2025] MegaFusion: Extend Diffusion Models towards Higher-resolution... | | Experimental |
| 85 | pyladiesams/personalization-with-text-to-image-diffusion-models-feb2024<br>Get familiar with different fine-tuning techniques for text-to-image models,... | | Experimental |
| 86 | AIDC-AI/CHATS<br>CHATS: Combining Human-Aligned Optimization and Test-Time Sampling for... | | Experimental |
| 87 | p-lambda/composed_finetuning<br>Code for the ICML 2021 paper "Composed Fine-Tuning: Freezing Pre-Trained... | | Experimental |
| 88 | dsshim0125/s2p<br>"S2P: State-conditioned Image Synthesis for Data Augmentation in Offline... | | Experimental |
| 89 | DeepakSridhar/fgdm<br>[NeurIPS 2024] Factor Graph Diffusion Models for Improved Prompt Alignment,... | | Experimental |
| 90 | Nithin-GK/MaxFusion<br>[ECCV'24] MaxFusion: Plug & Play multimodal generation in text to image... | | Experimental |
| 91 | rabiulcste/vismin<br>[NeurIPS24] VisMin: Visual Minimal-Change Understanding | | Experimental |
| 92 | showlab/BoxDiff<br>[ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free... | | Experimental |
| 93 | youweiliang/RichHF<br>Code for CVPR'24 best paper: Rich Human Feedback for Text-to-Image... | | Experimental |
| 94 | koi953215/NaRCan<br>[NeurIPS 2024] NaRCan: Natural Refined Canonical Image with Integration of... | | Experimental |
| 95 | JortVincenti/DMoE-VAR<br>Research code for the Dynamic Mixture-of-Experts in Visual Autoregressive... | | Experimental |
| 96 | wateasca/DiffusionVL<br>🌟 Translate autoregressive models into cutting-edge diffusion vision... | | Experimental |
| 97 | sooyeon-go/eye_for_an_eye<br>Eye-for-an-eye: Appearance Transfer with Semantic Correspondence in Diffusion Models | | Experimental |
| 98 | alibaba/mm-diff<br>MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration | | Experimental |
| 99 | Ka1b0/Foresight-Guidance<br>NeurIPS25 Spotlight \| Classifier-free guidance (CFG) can be viewed as... | | Experimental |
| 100 | bytedance/ID-Patch<br>Official implementation of CVPR 2025 paper "ID-Patch: Robust ID Association... | | Experimental |
| 101 | lxa9867/ControlVAR<br>This is the official implementation for ControlVAR. | | Experimental |
| 102 | johndpope/Emote-hack<br>Emote Portrait Alive - using ai to reverse engineer code from white paper.... | | Experimental |
| 103 | byliutao/Cradle2Cane<br>(NeurIPS 2025) From Cradle to Cane: A Two-Pass Framework for High-Fidelity... | | Experimental |
| 104 | yandex-research/adaptive-diffusion<br>[CVPR'2024] Adaptive Teacher-Student Collaboration for Text-Conditional... | | Experimental |
| 105 | ConceptBed/evaluations<br>[AAAI 2024] ConceptBed Evaluations for Personalized Text-to-Image Diffusion Models | | Experimental |
| 106 | tuananhbui89/Embedding-Adjustment<br>Mitigating Semantic Collapse in Generative Personalization with Test-Time... | | Experimental |
| 107 | yugwangyeol/Facial-caricature-profile-GIF<br>[Project] Facial-caricature-profile GIF | | Experimental |
| 108 | YangLing0818/ContextDiff<br>[ICLR 2024] Contextualized Diffusion Models for Text-Guided Image and Video... | | Experimental |
| 109 | Viresh-R/ml-CCA<br>Implementation of Fast ml-CCA from the ICCV-2015 work "Multi-Label... | | Experimental |
| 110 | CFGpp-diffusion/CFGpp<br>Official repository for "CFG++: manifold-constrained classifier free... | | Experimental |
| 111 | muzishen/RCDMs<br>[AAAI 2025] 🎬RCDMs🎬: Boosting Consistency in Story Visualization with... | | Experimental |
| 112 | hohonu-vicml/DirectedDiffusion<br>Directed Diffusion: Direct Control of Object Placement through Attention... | | Experimental |
| 113 | RuiqingYoung/EAR<br>Learning to Expand Images for Efficient Visual Autoregressive Modeling | | Experimental |
| 114 | sungnyun/diffblender<br>DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models | | Experimental |
| 115 | diaoenmao/Multimodal-Controller-for-Generative-Models<br>[CVMI 2022] Multimodal Controller for Generative Models | | Experimental |
| 116 | PeterHUistyping/M3ashy<br>M^3ashy: Multi-Modal Material Synthesis via Hyperdiffusion, AAAI'26 (former... | | Experimental |
| 117 | basiclab/MAD<br>MAD: Makeup All-in-One with Cross-Domain Diffusion Model | | Experimental |
| 118 | YangLing0818/RealCompo<br>[NeurIPS 2024] RealCompo: Balancing Realism and Compositionality Improves... | | Experimental |
| 119 | abyildirim/md-projtex<br>Text-guided 3D texture generation using training-free multi-diffusion in UV space. | | Experimental |
| 120 | nanlliu/Unsupervised-Compositional-Concepts-Discovery<br>[ICCV 2023] Unsupervised Compositional Concepts Discovery with Text-to-Image... | | Experimental |
| 121 | ZiyiZhang27/MVC-ZigAL<br>[CVPR 2026] Code for the paper "Refining Few-Step Text-to-Multiview... | | Experimental |
| 122 | dt-3t/LSRS<br>Official PyTorch implementation of "LSRS: Latent Scale Rejection Sampling... | | Experimental |
| 123 | wfanyue/DPG-T2I-Personalization<br>[ECCV 2024] Powerful and Flexible: Personalized Text-to-Image Generation via... | | Experimental |
| 124 | james-oldfield/PoS-subspaces<br>[NeurIPS'23] Parts of Speech–Grounded Subspaces in Vision-Language Models | | Experimental |
| 125 | agneet42/revision<br>[ECCV 2024] "REVISION: Rendering Tools Enable Spatial Fidelity in... | | Experimental |
| 126 | play-with-HOI-generation/HOIG<br>[NeurIPS 2022 Spotlight] Hand-Object Interaction Image Generation | | Experimental |
| 127 | X-GenGroup/PaCo-RL<br>Official Implementation for *PaCo-RL: Advancing Reinforcement Learning for... | | Experimental |
| 128 | SHI-Labs/Diffusion-Driven-Test-Time-Adaptation-via-Synthetic-Domain-Alignment<br>Everything to the Synthetic: Diffusion-driven Test-time Adaptation via... | | Experimental |
| 129 | quickgrid/paper-implementations<br>Attempts to implement various deep learning, computer vision papers. | | Experimental |
| 130 | TsinghuaC3I/Efficient-Diffusion-Models<br>TPAMI 2025 Survey Paper | | Experimental |
| 131 | anhquanpham/iterative-comp-rl-generation<br>Iterative Compositional Data Generation for Robot Control | | Experimental |
| 132 | jiuntian/OneHOI<br>[CVPR2026] Official repo for "OneHOI: Unifying Human-Object Interaction... | | Experimental |
| 133 | rese1f/pose2img<br>pose-driven human natural image generation based on latent diffusion model | | Experimental |