Diffusion Language Models

27 diffusion language model projects are tracked. The highest-rated is ZHZisZZ/dllm at 49/100 with 2,193 stars. 1 of the top 10 is actively maintained.

Get all 27 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=diffusion-language-models&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
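The same request can be made from a script. The following is a minimal Python sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented here, so the result is returned as parsed without assuming any field names. Note that the URL above uses limit=20; whether a higher value such as 27 is accepted by the API is an assumption that should be checked before relying on it.

```python
import json
import urllib.parse
import urllib.request

# Endpoint and query parameters copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the request URL (hypothetical helper, not part of the API)."""
    params = {
        "domain": "transformers",
        "subcategory": "diffusion-language-models",
        "limit": limit,  # the source URL uses 20; other values are untested
    }
    return f"{BASE_URL}?{urllib.parse.urlencode(params)}"


def fetch_projects(limit=20, timeout=30):
    """Fetch the dataset and return the parsed JSON as-is.

    The response structure is an unknown here, so no fields are accessed.
    """
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (performs a network call and counts against the daily quota):
#   projects = fetch_projects()
#   print(json.dumps(projects, indent=2))
```

Keeping URL construction separate from the network call makes it easy to inspect the exact request before spending one of the 100 daily unauthenticated requests.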

#  Model  Score  Tier
1  ZHZisZZ/dllm  49  Emerging
   dLLM: Simple Diffusion Language Modeling
2  EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications  48  Emerging
   Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications...
3  pengzhangzhi/Open-dLLM  44  Emerging
   Open diffusion language model for code generation — releasing pretraining,...
4  THUDM/LongWriter  41  Emerging
   [ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
5  AIoT-MLSys-Lab/SVD-LLM  39  Emerging
   [ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2
6  jxiw/MambaInLlama  38  Emerging
   [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and...
7  datamllab/LongLM  36  Emerging
   [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
8  czg1225/dParallel  33  Emerging
   [ICLR 2026] dParallel: Learnable Parallel Decoding for dLLMs
9  DAMO-NLP-SG/CLEX  32  Emerging
   [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models
10  JinjieNi/MegaDLMs  32  Emerging
   GPU-optimized framework for training diffusion language models at any scale....
11  hao-ai-lab/DistCA  30  Emerging
   Efficient Long-context Language Model Training by Core Attention Disaggregation
12  fvliang/DART  30  Emerging
   Official Implementation of DART (DART: Diffusion-Inspired Speculative...
13  tommyip/mamba2-minimal  30  Emerging
   Minimal Mamba-2 implementation in PyTorch
14  HKUDS/SepLLM  29  Experimental
   [ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One...
15  Ereboas/MagiCodec  29  Experimental
   A single-layer, streaming codec model providing SOTA audio quality and...
16  zjunlp/ModelKinship  28  Experimental
   Exploring Model Kinship for Merging Large Language Models
17  sail-sg/Attention-Sink  28  Experimental
   [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical...
18  SJTU-DENG-Lab/LightningRL  28  Experimental
   LightningRL: Breaking the Accuracy–Parallelism Trade-off of Block-wise dLLMs...
19  hao-ai-lab/d3LLM  27  Experimental
   d3LLM: Ultra-Fast Diffusion LLM 🚀
20  VITA-Group/Ms-PoE  27  Experimental
   "Found in the Middle: How Language Models Use Long Contexts Better via...
21  JarvisPei/MemDLM  27  Experimental
   MemDLM: Memory-enhanced Diffusion Language Model
22  uiuctml/Localize-and-Stitch  26  Experimental
   Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
23  AlgonetLabs/Cable  26  Experimental
   Context-aware Biases for Length Extrapolation
24  VITA-Group/TAPE  26  Experimental
   [ICML'25] "Rethinking Addressing in Language Models via Contextualized...
25  OpenMOSS/LongLLaDA  25  Experimental
   [AAAI26] LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
26  zhiyuanhubj/LongRecipe  18  Experimental
   LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
27  declare-lab/della  16  Experimental
   DELLA-Merging: Reducing Interference in Model Merging through...

Comparisons in this category