declare-lab/tango
A family of diffusion models for text-to-audio generation.
Tango uses a frozen, instruction-tuned Flan-T5 LLM as its text encoder, paired with a UNet-based latent diffusion model, to synthesize realistic audio across diverse sound categories from minimal training data. Tango 2 adds alignment via Direct Preference Optimization (DPO) on the Audio-Alpaca preference dataset to improve output quality. The project integrates with Hugging Face (model hosting, Accelerate for multi-GPU training) and provides inference through a simple Python API, with a configurable number of sampling steps for trading quality against speed.
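A minimal sketch of what inference with configurable sampling steps might look like. It assumes the `tango` package from this repository is installed and exposes a `Tango` class with a `generate(prompt, steps=...)` method, as the project's README describes. The helper name `generate_audio`, the default of 100 steps, and the checkpoint name are illustrative assumptions, not part of the project.

```python
def generate_audio(prompt, steps=100, model_name="declare-lab/tango"):
    """Generate audio for a text prompt (hypothetical helper, not from the repo)."""
    # Imported lazily so this module still loads where tango is not installed.
    # Assumption: the repo's package exposes `Tango` with this interface.
    from tango import Tango

    # Assumption: model_name is a checkpoint identifier hosted on Hugging Face.
    model = Tango(model_name)

    # More sampling steps generally trade speed for output quality.
    return model.generate(prompt, steps=steps)
```

Calling `generate_audio("An audience cheering and clapping", steps=200)` would then request a slower, higher-quality sample than the default. The result can presumably be written to disk with an audio library such as `soundfile`; check the README for the expected sample rate.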
1,233 stars. No commits in the last 6 months.
Stars: 1,233
Forks: 108
Language: Python
License: —
Category:
Last pushed: Jul 29, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/declare-lab/tango"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
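The same request can be sketched in Python using only the standard library. The URL pattern (`/quality/<category>/<owner>/<repo>`) is inferred from the single curl example above and may not hold for other repositories. The function names are hypothetical. The response format is not documented here, so the body is returned as plain text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is an assumption from one example."""
    parts = (category, owner, repo)
    # Percent-encode each segment so unusual characters cannot break the path.
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the raw response body as text (keyless access, 100 requests/day)."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For this repository, `fetch_quality("diffusion", "declare-lab", "tango")` requests the same URL as the curl command. How a key is supplied for the higher limit is not stated here, so the sketch covers keyless access only.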
Higher-rated alternatives
ljleb/sd-mecha
Executable State Dict Recipes
SJTU-DENG-Lab/Discrete-Diffusion-Forcing
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
Li-Jinsong/DAEDAL
[ICLR 2026] Official repository of "Beyond Fixed: Training-Free Variable-Length Denoising for...
SalesforceAIResearch/CoDA
Salesforce AI Research's open diffusion language model
AIDASLab/Awesome-Diffusion-LLM
A comprehensive list of papers about Large-Language-Diffusion-Models.