Diffusion Language Models LLM Tools

Tools and techniques for training, optimizing, and decoding diffusion-based language models. Includes memory enhancement, length extrapolation, constrained decoding, and inference acceleration for diffusion LLMs. Does NOT include standard autoregressive LLMs, general diffusion models for image generation, or non-diffusion-based language model architectures.

15 diffusion language model tools are tracked. The highest-rated is zhenye234/xcodec, which scores 39/100 and has 294 stars.

Get all 15 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=diffusion-language-models&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
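The same request can be made from a script. The following is a minimal Python sketch using only the standard library; the endpoint and query parameters are copied from the curl command above, while the helper names `build_quality_url` and `fetch_quality_dataset` are hypothetical and chosen for illustration. The shape of the returned JSON is not documented here, so the sketch decodes the body without assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain="llm-tools",
                      subcategory="diffusion-language-models",
                      limit=20):
    """Assemble the dataset URL with the same query parameters as the curl call.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_quality_dataset(url, timeout=30):
    """Download and decode the JSON body.

    Hypothetical helper: no assumption is made about the payload's schema,
    so the decoded object is returned as-is for the caller to inspect.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality_dataset(build_quality_url())` performs the request; printing `json.dumps(result, indent=2)` is a reasonable first step to see what fields the response contains. Keyless requests count against the 100 requests/day limit, so it may be worth caching the result locally.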

# | Tool | Description | Score (/100) | Tier
1 | zhenye234/xcodec | AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec... | 39 | Emerging
2 | zhuhanqing/APOLLO | APOLLO: SGD-like Memory, AdamW-level Performance; MLSys'25 Outstanding Paper... | 35 | Emerging
3 | Y-Research-SBU/CSRv2 | Official Repository for CSRv2 - ICLR 2026 | 24 | Experimental
4 | HITESHLPATEL/Mamba-Papers | Awesome Mamba Papers: A Curated Collection of Research Papers, Tutorials & Blogs | 23 | Experimental
5 | MouxiaoHuang/PPE | [ICLR 2026] Official code of PPE: Positional Preservation Embedding for... | 22 | Experimental
6 | psychofict/llm-effective-context-length | Investigating Why the Effective Context Length of LLMs Falls Short (Based on... | 22 | Experimental
7 | rishikksh20/mamba3-pytorch | Readable implementation of Mamba 3 SSM model | 18 | Experimental
8 | sayhitosandy/Mamba_SSM | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | 16 | Experimental
9 | hrlics/CoPE | CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs | 16 | Experimental
10 | yophis/decom-renorm-merge | Decom-Renorm-Merge: Merging deep learning models through shared representation space. | 16 | Experimental
11 | chen-hao-chao/mdm-prime-v2 | MDM-Prime-v2: Binary Encoding and Index Shuffling Enable Compute-optimal... | 16 | Experimental
12 | aflah02/Partial_RoPE_Analysis | Code accompanying the paper “Fractional Rotation, Full Potential?... | 15 | Experimental
13 | Anri-Lombard/Mamba-SAFE | Generating Molecules with the Mamba architecture | 13 | Experimental
14 | Ghost---Shadow/diff-bleu | A fully vectorized PyTorch implementation of BLEU scores optimized for... | 11 | Experimental
15 | Ghost---Shadow/diff-rouge | A fully vectorized PyTorch implementation of ROUGE scores optimized for... | 11 | Experimental