Speech Synthesis Diffusion Models

Diffusion models for speech and audio generation including TTS, voice conversion, singing synthesis, and vocoding. Does NOT include general image diffusion, music generation without speech focus, or non-diffusion audio processing.

There are 50 speech synthesis diffusion models tracked. 2 score above 50 (established tier). The highest-rated is PrunaAI/pruna at 59/100 with 1,142 stars. 1 of the top 10 is actively maintained.

Get all 50 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=diffusion&subcategory=speech-synthesis-diffusion&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
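
The same request can be issued from Python. The following is a minimal sketch using only the standard library; the endpoint and query parameters are copied from the curl command above, while the function names `build_url` and `fetch_projects` are hypothetical helpers introduced for illustration. The shape of the JSON response is not described on this page, so the sketch returns the decoded payload as-is rather than assuming any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken verbatim from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="diffusion", subcategory="speech-synthesis-diffusion", limit=20):
    """Build the query URL; the defaults mirror the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The response schema is an unknown here, so the decoded object is
    returned unchanged for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a network request, counted against the daily quota):
#   data = fetch_projects(build_url())
#   print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to vary `limit` or `subcategory` without spending requests from the daily allowance.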

#	Model	Score	Tier	Stars	Language
1	PrunaAI/pruna Pruna is a model optimization framework built for developers, enabling you...	59	Established	1,142	Python
2	bytedance/LatentSync Taming Stable Diffusion for Lip Sync!	51	Established	5,506	Python
3	haoheliu/AudioLDM-training-finetuning AudioLDM training, finetuning, evaluation and inference.	41	Emerging	297	Python
4	Text-to-Audio/Make-An-Audio PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio...	40	Emerging	669	Python
5	Aratako/Irodori-TTS A Flow Matching-based Text-to-Speech Model with Emoji-driven Style Control	40	Emerging	40	Python
6	sayakpaul/diffusers-torchao End-to-end recipes for optimizing diffusion models with torchao and...	39	Emerging	397	Python
7	teticio/audio-diffusion Apply diffusion models using the new Hugging Face diffusers package to...	37	Emerging	789	Jupyter Notebook
8	ivanvovk/WaveGrad Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch.	37	Emerging	408	Jupyter Notebook
9	Rongjiehuang/ProDiff PyTorch Implementation of ProDiff (ACM-MM'22) with an Extremely-Fast...	37	Emerging	432	Python
10	keonlee9420/DiffSinger PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow...	37	Emerging	247	Python
11	mazumdarsoumya/TempoSyncDiff Few-step diffusion for audio-driven talking head generation making diffusion...	36	Emerging	2	Python
12	keonlee9420/DiffGAN-TTS PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient...	36	Emerging	347	Python
13	iron-mukakin/Emoji-TTS A fork of Irodori-TTS; a web UI for echo-TTS.	35	Emerging	7	Python
14	yochaiye/LipVoicer Official Code implementation for the ICLR paper "LipVoicer: Generating...	34	Emerging	86	Python
15	zhenye234/CoMoSpeech ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via...	33	Emerging	213	Python
16	segmind/distill-sd Segmind Distilled diffusion	33	Emerging	619	Python
17	huggingface/diffusion-fast Faster generation with text-to-image diffusion models.	32	Emerging	232	Python
18	junhsss/consistency-models A Toolkit for OpenAI's Consistency Models.	30	Emerging	207	Python
19	G-U-N/Phased-Consistency-Model [NeurIPS 2024] Boosting the performance of consistency models with PCM!	30	Emerging	514	Python
20	sony/soundctm Pytorch implementation of SoundCTM	30	Emerging	101	Python
21	trinhtuanvubk/Diff-VC Diffusion Model for Voice Conversion	30	Emerging	69	Jupyter Notebook
22	xandergos/sCM-mnist Unofficial implementation of "Simplifying, Stabilizing & Scaling...	29	Experimental	89	Python
23	TencentARC/AudioStory AudioStory: Generating Long-Form Narrative Audio with Large Language Models	26	Experimental	299	Jupyter Notebook
24	FireRedTeam/Target-Driven-Distillation Consistency Distillation with Target Timestep Selection and Decoupled Guidance	25	Experimental	104	Python
25	koichi-saito-sony/soundctm_dit_iclr Pytorch implementation of SoundCTM-DiT	24	Experimental	4	Jupyter Notebook
26	hayeong0/Diff-HierVC Official Pytorch Implementation of "Diff-HierVC: Diffusion-based...	24	Experimental	235	Python
27	JiauZhang/binary-latent-diffusion Implementation of Binary Latent Diffusion	24	Experimental	51	Python
28	0x7o/DeepMozart Audio generation using diffusion models	24	Experimental	2	Python
29	mbreuss/consistency_models_toy_task Unofficial minimal implementation of consistency models (CM) proposed by...	23	Experimental	21	Python
30	MirageML/MirageStock Open-Source Implementations of Multi-Modal Diffusion Models Optimized for...	23	Experimental	198	Python
31	ashutosh1919/consistency-models Ready to run PyTorch implementation of Consistency Models: One-Step Image...	23	Experimental	6	Shell
32	OpenGVLab/LORIS [ICML2023] Long-Term Rhythmic Video Soundtracker	22	Experimental	62	Python
33	seahore/PPG-GradVC A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis	22	Experimental	44	Python
34	drakyanerlanggarizkiwardhana/Diffusers 🤗 Diffusers: State-of-the-art diffusion models for image and audio...	22	Experimental	1	Python
35	jabir-zheng/TCD Official Repository of the paper "Trajectory Consistency Distillation"	21	Experimental	363	Python
36	smsharma/consistency-models Implementation of Consistency Models (Song et al 2023) for few-step image...	20	Experimental	19	Jupyter Notebook
37	LiangXu123/Robust-One-step-Speech-Enhancement-via-Consistency-Distillation-ROSE-CD- Robust One-step Speech Enhancement via Consistency Distillation...	19	Experimental	10	—
38	Consistency-TTA/consistency-tta.github.io Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation	19	Experimental	7	HTML
39	romanycc/Audio-Diffusion Audio Diffusion	16	Experimental	4	Python
40	testzer0/GradTTS-unoffical My unofficial implementation of Grad-TTS (ICML 2021)	16	Experimental	4	Jupyter Notebook
41	AxiumCrisis61/StableSVC StableSVC: Latent Diffusion Model for Singing Voice Conversion (originally...	16	Experimental	4	Python
42	Bai-YT/ConsistencyTTA ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with...	16	Experimental	39	Python
43	mbreuss/consistency_trajectory_models_toy_task Minimal unofficial implementation of Consistency Trajectory models on a 1D toy task.	15	Experimental	22	Python
44	instill-ai/model-diffusion-dvc ⚗️ Diffusion model repository based on HuggingFace Diffusion 2.1 managed by DVC	15	Experimental	2	Python
45	jwliao1209/DiffMusic 🎼 DiffMusic: A Training-Free Diffusion Framework for Music Inverse Problem	14	Experimental	4	Python
46	juanalonso/diffusion-audio List of diffusion-based models and applications	13	Experimental	11	—
47	slegroux/nimrod minimal deep learning framework	13	Experimental	2	Jupyter Notebook
48	quickgrid/distill-sd Experiment with latent diffusion models.	12	Experimental	3	Python
49	minyoungpark1/Speech-Enhancement Unofficial implementation of SCP-GAN	12	Experimental	18	Python
50	7-4-7/BirdGen Implementation of classifier guided diffusion model on a procedurally...	11	Experimental	—	Jupyter Notebook
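
The summary figures at the top of the page (for example, how many projects score above 50) can be recomputed from the table rows. The sketch below parses a handful of rows copied from the table, with descriptions trimmed; it assumes the columns are tab-separated and that the repository name is the first space-delimited token of the Model column. `parse_row` is a hypothetical helper, not part of any published tooling.

```python
from collections import Counter

# Sample rows copied from the table above (descriptions trimmed).
# "\u2014" is the dash the table uses for a missing value.
ROWS = (
    "1\tPrunaAI/pruna Pruna is a model optimization framework\t59\tEstablished\t1,142\tPython\n"
    "2\tbytedance/LatentSync Taming Stable Diffusion for Lip Sync!\t51\tEstablished\t5,506\tPython\n"
    "3\thaoheliu/AudioLDM-training-finetuning AudioLDM training\t41\tEmerging\t297\tPython\n"
    "22\txandergos/sCM-mnist Unofficial implementation\t29\tExperimental\t89\tPython\n"
    "50\t7-4-7/BirdGen Implementation of classifier guided diffusion\t11\tExperimental\t\u2014\tJupyter Notebook\n"
)


def parse_row(line):
    """Split one tab-separated table row into a dict of typed fields."""
    rank, model, score, tier, stars, language = line.split("\t")
    repo, _, description = model.partition(" ")
    return {
        "rank": int(rank),
        "repo": repo,
        "description": description,
        "score": int(score),
        "tier": tier,
        # Star counts use thousands separators; a dash means "not reported".
        "stars": None if stars == "\u2014" else int(stars.replace(",", "")),
        "language": language,
    }


projects = [parse_row(line) for line in ROWS.splitlines()]

# Projects scoring above 50, matching the "established tier" note in the summary.
above_50 = [p["repo"] for p in projects if p["score"] > 50]

# Number of sampled projects per tier, using the Tier column as given.
tier_counts = Counter(p["tier"] for p in projects)
```

Running the same parsing over all 50 rows would let the counts in the summary paragraph be checked directly against the table.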

Comparisons in this category

LatentSync and TempoSyncDiff (51 vs 36)
DiffSinger and DiffGAN-TTS (37 vs 36)