teticio/audio-diffusion

Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.

Score: 44 / 100 (Emerging)

Converts audio into mel spectrograms to train diffusion models, then reconstructs audio from the generated spectrograms. Supports standard DDPM as well as latent diffusion (using VAE compression), DDIM sampling for faster inference (~50 steps), and generation conditioned on text or audio embeddings. Integrates directly with Hugging Face's `diffusers` package and model hub; pre-trained checkpoints for music genres and Gradio interfaces for interactive use are available.
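The audio-to-image step described above can be illustrated with a minimal NumPy-only sketch. This is not the repository's code: the function names (`audio_to_mel_image`, `mel_image_to_db`) and every parameter value (sample rate, FFT size, hop length, number of mel bands, 80 dB dynamic range) are assumptions chosen for illustration. It shows how a waveform might become an 8-bit grayscale mel spectrogram that an image diffusion model could train on, and how pixel values map back to decibels. Recovering a waveform from those decibels additionally needs phase reconstruction (for example Griffin-Lim), which is omitted here.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose centres are evenly spaced on the mel scale.
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def audio_to_mel_image(audio, sr=22050, n_fft=2048, hop=512, n_mels=64, top_db=80.0):
    # Short-time Fourier transform with a Hann window (no padding).
    window = np.hanning(n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop
    frames = np.stack([audio[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T     # (n_mels, n_frames)
    # Convert to decibels relative to the peak, clipped to [-top_db, 0].
    db = 10.0 * np.log10(np.maximum(mel, 1e-10))
    db = np.maximum(db, db.max() - top_db) - db.max()
    # Quantize to an 8-bit grayscale image: 0 = -top_db, 255 = peak.
    return np.round((db + top_db) / top_db * 255.0).astype(np.uint8)

def mel_image_to_db(image, top_db=80.0):
    # Inverse of the quantization step only (phase is not recovered).
    return image.astype(np.float64) / 255.0 * top_db - top_db

# Usage: one second of a 440 Hz tone becomes a 64 x 40 grayscale image.
sr = 22050
tone = np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr)
image = audio_to_mel_image(tone, sr=sr)
print(image.shape, image.dtype)
```

Quantizing to 8 bits keeps the spectrogram compatible with standard image tooling, at the cost of limiting the dynamic range to the chosen `top_db`.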

789 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25

Stars: 789
Forks: 79
Language: Jupyter Notebook
License: GPL-3.0
Last pushed: Sep 25, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/teticio/audio-diffusion"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
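The same request can be made from Python using only the standard library. This sketch is built on assumptions: the `/quality/{category}/{owner}/{repo}` path pattern is inferred from the single URL shown above, the response is assumed to be JSON, and the helper names `build_quality_url` and `fetch_quality` are hypothetical. No response fields are assumed, since the schema is not documented here.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1"

def build_quality_url(category: str, owner: str, repo: str) -> str:
    # Path layout inferred from the one example URL; treat it as an assumption.
    return f"{API_BASE}/quality/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    # Performs an unauthenticated GET and parses the body as JSON
    # (the JSON format is assumed, not confirmed by the page).
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example call (requires network access, counts against the daily quota):
# data = fetch_quality("diffusion", "teticio", "audio-diffusion")
# print(json.dumps(data, indent=2))
```

Keeping the URL construction separate from the network call makes it easy to check the request target without spending any of the 100 keyless requests per day.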