Zero-Shot Voice Synthesis Voice AI Tools

Tools for synthesizing speech with zero-shot or few-shot learning, enabling speaker cloning, emotion control, style transfer, and voice conversion without extensive training data. This category does not include general text-to-speech engines, ASR systems, or voice synthesis approaches that are not zero-shot.

There are 43 zero-shot voice synthesis tools tracked. 3 score 50 or higher (Established tier). The highest-rated is index-tts/index-tts at 63/100 with 19,454 stars. 2 of the top 10 are actively maintained.

Get all 43 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=voice-ai&subcategory=zero-shot-voice-synthesis&limit=20"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
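The same request can be made from Python with only the standard library. The sketch below copies the URL and query parameters from the curl call above; the helper names `build_url` and `fetch_dataset` are hypothetical, and the shape of the JSON response is not documented here, so the code decodes the body without assuming any particular fields. The `limit=20` default mirrors the curl example; whether the API honors larger values is not stated on this page.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; no API key is sent.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="voice-ai", subcategory="zero-shot-voice-synthesis", limit=20):
    """Build the dataset URL with the same query parameters as the curl call."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_dataset(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The response schema is an unknown here, so the parsed value is returned as-is.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_dataset(build_url())` performs one request, which counts against the daily quota mentioned above.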

#	Tool	Score	Tier	Stars	Language
1	index-tts/index-tts An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System	63	Established	19,454	Python
2	lucasnewman/f5-tts-mlx Implementation of F5-TTS in MLX	55	Established	611	Python
3	stepfun-ai/Step-Audio-EditX A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model...	50	Established	884	Python
4	unilight/seq2seq-vc A sequence-to-sequence voice conversion toolkit.	46	Emerging	108	Jupyter Notebook
5	JosefAlbers/e2tts-mlx Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX	41	Emerging	29	Python
6	FireRedTeam/FireRedTTS An Open-Sourced LLM-empowered Foundation TTS System	39	Emerging	905	Python
7	RaduBolbo/F5-TTS-Emotional-CFG Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class...	39	Emerging	30	Python
8	ubisoft/ubisoft-laforge-daft-exprt Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis	38	Emerging	129	Python
9	Kyubyong/cross_vc Cross-lingual Voice Conversion	38	Emerging	97	Python
10	Edresson/YourTTS YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion...	38	Emerging	1,052	Jupyter Notebook
11	lucasnewman/f5-tts-swift Implementation of F5-TTS in Swift using MLX	37	Emerging	91	Swift
12	hi-paris/Prosody-Control-French-TTS An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control	37	Emerging	31	Python
13	keonlee9420/Cross-Speaker-Emotion-Transfer PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based...	36	Emerging	194	Python
14	uetuluk/xcodec2-infer-lib CPU support for xcodec2	35	Emerging	6	Python
15	Emotional-Text-to-Speech/hmm-for-emo-tts A repository with comprehensive instructions for using the...	34	Emerging	50	CSS
16	WangHelin1997/SSR-Speech SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis	34	Emerging	147	Python
17	keonlee9420/Robust_Fine_Grained_Prosody_Control PyTorch Implementation of Robust and fine-grained prosody control of...	33	Emerging	41	Python
18	adelacvg/ttts Train the next generation of TTS systems.	33	Emerging	171	Python
19	lucasnewman/descript-mlx Implementation of the Descript Audio Codec in MLX	33	Emerging	10	Python
20	aiola-lab/drax Drax: Speech Recognition with Discrete Flow Matching	32	Emerging	75	Python
21	hcy71o/SC-CNN SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker...	31	Emerging	39	Python
22	WelkinYang/Learn2Sing2.0 Diffusion and Mutual Information-Based Target Speaker SVS by Learning from...	28	Experimental	181	JavaScript
23	ddlBoJack/MT4SSL [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL:...	26	Experimental	45	Python
24	NN-Project-2/Emotion-TTS-Emebddings This project explores zero-shot emotional speech synthesis using EMOD, a...	25	Experimental	18	Python
25	ictnlp/ComSpeech Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct...	24	Experimental	26	Python
26	rishikksh20/Zero-Shot-TTS Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based...	24	Experimental	34	Python
27	adelacvg/detail_tts All generative model in one for better TTS model	23	Experimental	74	Python
28	lordzuko/cross-text-PT Improving the Appropriateness in Cross-Text Prosody Transfer using Human Supervision	23	Experimental	2	Python
29	CMsmartvoice/Unet-TTS One-shot TTS with Improved Unseen Speaker and Style Transfer	23	Experimental	37	—
30	xuan3986/UDDETTS The first LLM that unifies discrete and dimensional emotions for...	23	Experimental	8	Python
31	zhenye234/FlashSpeech ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis	22	Experimental	155	Python
32	jishengpeng/ControlSpeech [ACL 2025 Main] ControlSpeech: Towards Simultaneous Zero-shot Speaker...	22	Experimental	275	Python
33	fmiotello/fastVC A simple voice conversion tool	22	Experimental	20	Python
34	NassimaOULDOUALI/Prosody-Control-French-TTS An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control	21	Experimental	19	Python
35	WelkinYang/EMPHASIS-pytorch EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System	21	Experimental	15	Python
36	ORI-Muchim/Grad-TTS 'Grad-TTS' with Multilingual Cleaners	21	Experimental	11	Jupyter Notebook
37	Rumeysakeskin/Turkish-Text-to-Speech Speech synthesis (TTS) in low-resource languages by training from scratch...	20	Experimental	66	Python
38	jzmzhong/Automatic-Prosody-Annotator-with-SSWP-CLAP An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS).	20	Experimental	51	Python
39	MotivationalSpeechSynthesis/motivational-speech-synthesis Artistic research deconstructing the performative excess of motivational...	17	Experimental	2	Python
40	the-bird-F/Expressive-Vectors [ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal...	17	Experimental	38	Python
41	adelacvg/DPTTS An AR+AR TTS attempt.	16	Experimental	18	Python
42	Wonbin-Jung/e3-vits Official GitHub page of E3-VITS	14	Experimental	9	HTML
43	wenhuahuo/Cross-Device-Acoustic-Communication-Python-Implementation Digital acoustic communication tools using QFSK and Convolutional Encode. Cross-device acoustic communication.	14	Experimental	9	Python
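The Tier column in the table is consistent with simple score cutoffs, which can be sketched as a small function. This is an illustration under assumptions, not the site's published scoring rule: the Established cutoff of 50 matches the lowest Established score in the table, while the Emerging cutoff of 30 is a guess, since the table only shows that 31 is Emerging and 28 is Experimental. The function name `tier_for` and its parameters are hypothetical.

```python
def tier_for(score, established_min=50, emerging_min=30):
    """Map a 0-100 score to a tier label.

    established_min=50 matches the table (a score of 50 is listed as Established).
    emerging_min=30 is an assumption: the true boundary lies somewhere in 29..31.
    """
    if score >= established_min:
        return "Established"
    if score >= emerging_min:
        return "Emerging"
    return "Experimental"


# Spot checks against rows from the table above.
for score in (63, 50, 46, 31, 28, 14):
    print(score, tier_for(score))
```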

Comparisons in this category

f5-tts-mlx and f5-tts-swift (55 vs 37)