Speech Synthesis Diffusion Models
Diffusion models for speech and audio generation including TTS, voice conversion, singing synthesis, and vocoding. Does NOT include general image diffusion, music generation without speech focus, or non-diffusion audio processing.
50 speech synthesis diffusion models are tracked. Two score above 50 (the Established tier). The highest-rated is PrunaAI/pruna at 59/100 with 1,142 stars. Only 1 of the top 10 is actively maintained.
Get all 50 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=diffusion&subcategory=speech-synthesis-diffusion&limit=20"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
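The same request can be made from a script. The following is a minimal sketch in Python using only the standard library; it reuses the URL and query parameters from the curl command above. The helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not documented on this page, the payload is returned without assuming any field names. The original command passes `limit=20`; whether a larger `limit` returns all 50 projects is an assumption to verify against the API.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="diffusion", subcategory="speech-synthesis-diffusion", limit=20):
    """Build the dataset URL with the same query parameters as the curl command."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and return the decoded JSON payload unchanged.

    The response structure is not documented here, so no keys are assumed.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Keyless access is limited to 100 requests/day, so avoid calling this in a loop.
    data = fetch_projects(build_url())
    print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to inspect the response once and then adapt the parsing to whatever structure the API actually returns.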
| # | Model | Description | Score | Tier |
|---|---|---|---|---|
| 1 | PrunaAI/pruna | Pruna is a model optimization framework built for developers, enabling you... | | Established |
| 2 | bytedance/LatentSync | Taming Stable Diffusion for Lip Sync! | | Established |
| 3 | haoheliu/AudioLDM-training-finetuning | AudioLDM training, finetuning, evaluation and inference. | | Emerging |
| 4 | Text-to-Audio/Make-An-Audio | PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio... | | Emerging |
| 5 | Aratako/Irodori-TTS | A Flow Matching-based Text-to-Speech Model with Emoji-driven Style Control | | Emerging |
| 6 | sayakpaul/diffusers-torchao | End-to-end recipes for optimizing diffusion models with torchao and... | | Emerging |
| 7 | teticio/audio-diffusion | Apply diffusion models using the new Hugging Face diffusers package to... | | Emerging |
| 8 | ivanvovk/WaveGrad | Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch. | | Emerging |
| 9 | Rongjiehuang/ProDiff | PyTorch Implementation of ProDiff (ACM-MM'22) with an Extremely-Fast... | | Emerging |
| 10 | keonlee9420/DiffSinger | PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow... | | Emerging |
| 11 | mazumdarsoumya/TempoSyncDiff | Few-step diffusion for audio-driven talking head generation making diffusion... | | Emerging |
| 12 | keonlee9420/DiffGAN-TTS | PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient... | | Emerging |
| 13 | iron-mukakin/Emoji-TTS | A fork of Irodori-TTS that serves as a web UI for echo-TTS. | | Emerging |
| 14 | yochaiye/LipVoicer | Official Code implementation for the ICLR paper "LipVoicer: Generating... | | Emerging |
| 15 | zhenye234/CoMoSpeech | ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via... | | Emerging |
| 16 | segmind/distill-sd | Segmind Distilled diffusion | | Emerging |
| 17 | huggingface/diffusion-fast | Faster generation with text-to-image diffusion models. | | Emerging |
| 18 | junhsss/consistency-models | A Toolkit for OpenAI's Consistency Models. | | Emerging |
| 19 | G-U-N/Phased-Consistency-Model | [NeurIPS 2024] Boosting the performance of consistency models with PCM! | | Emerging |
| 20 | sony/soundctm | PyTorch implementation of SoundCTM | | Emerging |
| 21 | trinhtuanvubk/Diff-VC | Diffusion Model for Voice Conversion | | Emerging |
| 22 | xandergos/sCM-mnist | Unofficial implementation of "Simplifying, Stabilizing & Scaling... | | Experimental |
| 23 | TencentARC/AudioStory | AudioStory: Generating Long-Form Narrative Audio with Large Language Models | | Experimental |
| 24 | FireRedTeam/Target-Driven-Distillation | Consistency Distillation with Target Timestep Selection and Decoupled Guidance | | Experimental |
| 25 | koichi-saito-sony/soundctm_dit_iclr | PyTorch implementation of SoundCTM-DiT | | Experimental |
| 26 | hayeong0/Diff-HierVC | Official PyTorch Implementation of "Diff-HierVC: Diffusion-based... | | Experimental |
| 27 | JiauZhang/binary-latent-diffusion | Implementation of Binary Latent Diffusion | | Experimental |
| 28 | 0x7o/DeepMozart | Audio generation using diffusion models | | Experimental |
| 29 | mbreuss/consistency_models_toy_task | Unofficial minimal implementation of consistency models (CM) proposed by... | | Experimental |
| 30 | MirageML/MirageStock | Open-Source Implementations of Multi-Modal Diffusion Models Optimized for... | | Experimental |
| 31 | ashutosh1919/consistency-models | Ready to run PyTorch implementation of Consistency Models: One-Step Image... | | Experimental |
| 32 | OpenGVLab/LORIS | [ICML2023] Long-Term Rhythmic Video Soundtracker | | Experimental |
| 33 | seahore/PPG-GradVC | A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis | | Experimental |
| 34 | drakyanerlanggarizkiwardhana/Diffusers | 🤗 Diffusers: State-of-the-art diffusion models for image and audio... | | Experimental |
| 35 | jabir-zheng/TCD | Official Repository of the paper "Trajectory Consistency Distillation" | | Experimental |
| 36 | smsharma/consistency-models | Implementation of Consistency Models (Song et al 2023) for few-step image... | | Experimental |
| 37 | LiangXu123/Robust-One-step-Speech-Enhancement-via-Consistency-Distillation-ROSE-CD- | Robust One-step Speech Enhancement via Consistency Distillation... | | Experimental |
| 38 | Consistency-TTA/consistency-tta.github.io | Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation | | Experimental |
| 39 | romanycc/Audio-Diffusion | Audio Diffusion | | Experimental |
| 40 | testzer0/GradTTS-unoffical | My unofficial implementation of Grad-TTS (ICML 2021) | | Experimental |
| 41 | AxiumCrisis61/StableSVC | StableSVC: Latent Diffusion Model for Singing Voice Conversion (originally... | | Experimental |
| 42 | Bai-YT/ConsistencyTTA | ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with... | | Experimental |
| 43 | mbreuss/consistency_trajectory_models_toy_task | Minimal unofficial implementation of Consistency Trajectory models on a 1D toy task. | | Experimental |
| 44 | instill-ai/model-diffusion-dvc | ⚗️ Diffusion model repository based on HuggingFace Diffusion 2.1 managed by DVC | | Experimental |
| 45 | jwliao1209/DiffMusic | 🎼 DiffMusic: A Training-Free Diffusion Framework for Music Inverse Problem | | Experimental |
| 46 | juanalonso/diffusion-audio | List of models and applications based on diffusion | | Experimental |
| 47 | slegroux/nimrod | minimal deep learning framework | | Experimental |
| 48 | quickgrid/distill-sd | Experiment with latent diffusion models. | | Experimental |
| 49 | minyoungpark1/Speech-Enhancement | Unofficial implementation of SCP-GAN | | Experimental |
| 50 | 7-4-7/BirdGen | Implementation of classifier guided diffusion model on a procedurally... | | Experimental |