declare-lab/tango

A family of diffusion models for text-to-audio generation.

46 / 100 (Emerging)

Pairs a frozen, instruction-tuned Flan-T5 LLM for text encoding with a UNet-based latent diffusion model, enabling realistic audio synthesis across diverse sound categories from minimal training data. Tango 2 adds alignment via Direct Preference Optimization (DPO) on the Audio-Alpaca preference dataset to improve output quality. The project integrates with Hugging Face (model hosting, plus Accelerate for multi-GPU training) and provides inference through a simple Python API with a configurable number of sampling steps for trading quality against speed.
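As a rough illustration of the Python API and the sampling-step setting mentioned above, the sketch below wraps a single generation call. It assumes the `Tango` class and its `generate(prompt, steps=...)` method as shown in the repository README, the `declare-lab/tango2` checkpoint name, and a 16 kHz output rate; `safe_filename` and `synthesize` are hypothetical helpers introduced here, and the `steps=100` default is an arbitrary choice, not a documented one.

```python
import re


def safe_filename(prompt: str) -> str:
    """Turn a text prompt into a filesystem-friendly .wav name (hypothetical helper)."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return f"{slug or 'audio'}.wav"


def synthesize(prompt: str, steps: int = 100,
               checkpoint: str = "declare-lab/tango2") -> str:
    """Generate audio for `prompt` and write it to a .wav file; returns the path.

    Assumption: `Tango(checkpoint)` and `generate(prompt, steps=...)` behave as in
    the repository README. More steps generally means higher quality but slower
    sampling.
    """
    # Imported lazily: both packages are heavy, and loading the model
    # downloads a large checkpoint on first use.
    import soundfile as sf
    from tango import Tango  # provided by the declare-lab/tango repository

    model = Tango(checkpoint)
    audio = model.generate(prompt, steps=steps)
    path = safe_filename(prompt)
    sf.write(path, audio, samplerate=16000)  # assumed 16 kHz output
    return path
```

Calling `synthesize("An audience cheering and clapping", steps=200)` would, under these assumptions, spend more sampling steps for a cleaner result; a lower value gives a faster preview.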

1,233 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25
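The four sub-scores listed here add up to the headline 46 / 100. The sketch below checks that arithmetic, assuming the overall score is simply the sum of the four categories, each capped at 25; the page itself does not spell out the formula, so treat this as an observation rather than the site's actual method.

```python
# Sub-scores as listed on this page, each out of 25.
SUBSCORES = {"Maintenance": 2, "Adoption": 10, "Maturity": 16, "Community": 18}
MAX_PER_CATEGORY = 25


def overall_score(subscores: dict) -> int:
    """Sum the category scores (assumed aggregation rule)."""
    return sum(subscores.values())


def max_score(subscores: dict) -> int:
    """Highest possible total given the per-category cap."""
    return MAX_PER_CATEGORY * len(subscores)


print(f"{overall_score(SUBSCORES)} / {max_score(SUBSCORES)}")
```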


Stars 1,233
Forks 108
Language Python
License
Last pushed Jul 29, 2025
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/declare-lab/tango"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
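The same request can be made from Python using only the standard library. This is a minimal sketch: the `category/owner/repo` path layout is inferred from the single curl example above, the response is assumed to be JSON, and `quality_url` and `fetch_quality` are hypothetical helper names. No field names are shown because the response schema is not documented here.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path layout assumed from the curl example)."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the quality record; assumes the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For this project the call would be `fetch_quality("diffusion", "declare-lab", "tango")`. Keyless access is limited to 100 requests/day, so callers that poll should cache the result.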