declare-lab/TangoFlux
[ICLR 2026] TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
TangoFlux uses Diffusion Transformer blocks (DiT/MMDiT) conditioned on text and duration embeddings, trained with rectified flow matching to learn trajectories in a VAE-compressed latent space. The three-stage training pipeline incorporates CRPO (CLAP-Ranked Preference Optimization), which iteratively synthesizes preference pairs and applies a DPO loss to align generated audio with human preferences. It integrates with Hugging Face (model hosting and the Accelerate training framework) and with ComfyUI for node-based workflows, and it is accessible through a Python API, a CLI, and a web interface.
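The two ideas named above, rectified flow matching and ranked preference pairs, can be illustrated with a minimal, dependency-free sketch. It is not the repository's code: the function names are hypothetical, latents are plain lists of floats, the convention that t=0 is noise and t=1 is data is an assumption, and picking the highest- and lowest-scored candidates as the pair is one plausible reading of "CLAP-ranked" rather than a confirmed detail.

```python
def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of that straight line, constant in t: x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at x_t."""
    x_t = interpolate(x0, x1, t)
    v_pred = predict_velocity(x_t, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)


def euler_sample(predict_velocity, x0, steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = predict_velocity(x, i * dt)
        x = [a + dt * b for a, b in zip(x, v)]
    return x


def pick_preference_pair(candidates, scores):
    """Return (winner, loser): the highest- and lowest-scored candidates.

    Assumption: scores come from some text-audio similarity model; the
    pair would then feed a DPO-style loss, which is not shown here.
    """
    ranked = sorted(zip(scores, range(len(candidates))))
    return candidates[ranked[-1][1]], candidates[ranked[0][1]]
```

In a real model, `predict_velocity` would be the transformer evaluated on the noisy latent, the timestep, and the conditioning embeddings; here any callable taking `(x, t)` and returning a list of floats works.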
Stars: 843
Forks: 76
Language: Jupyter Notebook
License: —
Category:
Last pushed: Jan 28, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/declare-lab/TangoFlux"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
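The same request can be made from Python with only the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the path layout is inferred from the single URL shown above, and the response is assumed to be JSON, which the page does not state.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the layout beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the endpoint URL for a category and an 'owner/name' repository."""
    # quote() keeps '/' by default, so 'owner/name' stays two path segments.
    return f"{BASE_URL}/{quote(category)}/{quote(repo)}"


def fetch_quality(category, repo, timeout=10.0):
    """Fetch the record and decode it, assuming the body is JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("generative-ai", "declare-lab/TangoFlux")` would issue the same GET request as the curl command; unauthenticated calls count against the 100 requests/day limit.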
Related tools
asigalov61/tegridy-tools
Symbolic Music NLP Artificial Intelligence Toolkit
jaschadub/harmonydagger
Make Music Unlearnable for Generative AI.
kyegomez/MORPHEUS-1
Implementation of "MORPHEUS-1" from Prophetic AI and "The world’s first multi-modal generative...
salu133445/musegan
An AI for Music Generation
FORARTfe/HyMPS
HyMPS will be a platform-independent software suite for advanced audio/video content production.