Diffusion Language Models
27 diffusion language models are tracked. The highest-rated is ZHZisZZ/dllm at 49/100 with 2,193 stars. Only 1 of the top 10 is actively maintained.
Get the projects as JSON (27 are tracked; the example request below sets limit=20):
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=diffusion-language-models&limit=20"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
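The same request can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not documented here, the code only parses the body as JSON and makes no assumption about its fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers",
              subcategory="diffusion-language-models",
              limit=20):
    """Build the query URL; defaults mirror the curl example."""
    query = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and parse the body as JSON.

    The structure of the returned object is not specified on this page,
    so inspect it before relying on particular keys.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs the network request and counts against the daily quota. Whether a larger `limit` value returns all 27 projects in one call is an assumption that would need to be checked against the API.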
| # | Model | Score | Tier |
|---|---|---|---|
| 1 | ZHZisZZ/dllm<br>dLLM: Simple Diffusion Language Modeling | | Emerging |
| 2 | EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications<br>Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications... | | Emerging |
| 3 | pengzhangzhi/Open-dLLM<br>Open diffusion language model for code generation — releasing pretraining,... | | Emerging |
| 4 | THUDM/LongWriter<br>[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs | | Emerging |
| 5 | AIoT-MLSys-Lab/SVD-LLM<br>[ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2 | | Emerging |
| 6 | jxiw/MambaInLlama<br>[NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and... | | Emerging |
| 7 | datamllab/LongLM<br>[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning | | Emerging |
| 8 | czg1225/dParallel<br>[ICLR 2026] dParallel: Learnable Parallel Decoding for dLLMs | | Emerging |
| 9 | DAMO-NLP-SG/CLEX<br>[ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models | | Emerging |
| 10 | JinjieNi/MegaDLMs<br>GPU-optimized framework for training diffusion language models at any scale.... | | Emerging |
| 11 | hao-ai-lab/DistCA<br>Efficient Long-context Language Model Training by Core Attention Disaggregation | | Emerging |
| 12 | fvliang/DART<br>Official Implementation of DART (DART: Diffusion-Inspired Speculative... | | Emerging |
| 13 | tommyip/mamba2-minimal<br>Minimal Mamba-2 implementation in PyTorch | | Emerging |
| 14 | HKUDS/SepLLM<br>[ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One... | | Experimental |
| 15 | Ereboas/MagiCodec<br>A single-layer, streaming codec model providing SOTA audio quality and... | | Experimental |
| 16 | zjunlp/ModelKinship<br>Exploring Model Kinship for Merging Large Language Models | | Experimental |
| 17 | sail-sg/Attention-Sink<br>[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical... | | Experimental |
| 18 | SJTU-DENG-Lab/LightningRL<br>LightningRL: Breaking the Accuracy–Parallelism Trade-off of Block-wise dLLMs... | | Experimental |
| 19 | hao-ai-lab/d3LLM<br>d3LLM: Ultra-Fast Diffusion LLM 🚀 | | Experimental |
| 20 | VITA-Group/Ms-PoE<br>"Found in the Middle: How Language Models Use Long Contexts Better via... | | Experimental |
| 21 | JarvisPei/MemDLM<br>MemDLM: Memory-enhanced Diffusion Language Model | | Experimental |
| 22 | uiuctml/Localize-and-Stitch<br>Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic | | Experimental |
| 23 | AlgonetLabs/Cable<br>Context-aware Biases for Length Extrapolation | | Experimental |
| 24 | VITA-Group/TAPE<br>[ICML'25] "Rethinking Addressing in Language Models via Contextualized... | | Experimental |
| 25 | OpenMOSS/LongLLaDA<br>[AAAI26] LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs | | Experimental |
| 26 | zhiyuanhubj/LongRecipe<br>LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models | | Experimental |
| 27 | declare-lab/della<br>DELLA-Merging: Reducing Interference in Model Merging through... | | Experimental |