RaduBolbo/F5-TTS-Emotional-CFG
Zero-shot voice-cloning text-to-speech (TTS) with explicit emotion-class conditioning, built on F5-TTS
Extends F5-TTS with multi-term classifier-free guidance (CFG) for explicit emotion conditioning across five emotion classes (Neutral, Happy, Sad, Angry, Surprised). The model is fine-tuned on the ESD dataset. A separate CFG strength parameter controls emotion intensity independently while preserving zero-shot voice cloning. CLI inference exposes this emotion guidance strength so users can trade synthesis naturalness against emotional expressiveness.
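To make the idea of a separate emotion guidance strength concrete, here is a minimal sketch of one common way to combine predictions in multi-term classifier-free guidance. It is an illustration under assumptions, not the repository's actual formula: the function name `multi_term_cfg`, its parameter names, and the specific combination rule are all hypothetical, and the real model operates on tensors rather than plain lists.

```python
from typing import List, Sequence


def multi_term_cfg(
    v_uncond: Sequence[float],
    v_base: Sequence[float],
    v_emotion: Sequence[float],
    base_strength: float,
    emotion_strength: float,
) -> List[float]:
    """Combine three predictions using two independent guidance scales.

    Hypothetical inputs (assumed, not taken from the repository):
      v_uncond  - prediction with all conditioning dropped
      v_base    - prediction conditioned on text and reference audio only
      v_emotion - prediction conditioned on text, reference audio, and emotion class
    """
    return [
        # Start from the unconditional prediction, then add one guidance
        # term per conditioning signal, each with its own scale.
        u + base_strength * (b - u) + emotion_strength * (e - b)
        for u, b, e in zip(v_uncond, v_base, v_emotion)
    ]


# Toy one-dimensional example: the emotion scale only affects the (e - b) term.
guided = multi_term_cfg([0.0], [1.0], [3.0], base_strength=2.0, emotion_strength=1.5)
print(guided)
```

In this sketch, setting `emotion_strength` to 0 removes the emotion term entirely, while raising it pushes the output further along the direction that the emotion label adds on top of the base conditioning. That is the sense in which a second scale can trade naturalness against expressiveness.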
Stars: 30
Forks: 5
Language: Python
License: MIT
Category:
Last pushed: Mar 03, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/RaduBolbo/F5-TTS-Emotional-CFG"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
index-tts/index-tts
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
lucasnewman/f5-tts-mlx
Implementation of F5-TTS in MLX
stepfun-ai/Step-Audio-EditX
A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing...
unilight/seq2seq-vc
A sequence-to-sequence voice conversion toolkit.
FireRedTeam/FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System