ZET-Speech/ZET-Speech-Demo
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models (TTS)
This tool helps content creators, educators, and anyone else who needs to generate speech with a specific emotion from text. You provide the written text and an audio sample that demonstrates the desired emotional tone, and it produces a natural-sounding recording of your text spoken with that emotion. It is designed for professionals who need high-quality, emotionally expressive voiceovers without hiring voice actors for every nuance.
No commits in the last 6 months.
Use this if you need to quickly generate spoken audio that conveys a specific emotional style, based on a short example.
Not ideal if you require precise, syllable-level control over speech elements or if you only need a standard, unemotional voice.
Stars: 10
Forks: —
Language: JavaScript
License: —
Category:
Last pushed: Mar 09, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/ZET-Speech/ZET-Speech-Demo"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
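The curl call above can also be made from a script. The following is a minimal Python sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the split of the path into a domain segment followed by owner and repository is inferred from the single example URL, and the response format is not documented here, so the body is returned as raw text.

```python
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything after it is assumed
# to be <domain>/<owner>/<repo>, inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(domain: str, owner: str, repo: str) -> str:
    """Join the three path segments onto the base URL, percent-encoding each."""
    segments = (domain, owner, repo)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def fetch_quality(domain: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Send a keyless GET request and return the response body as text.

    The response schema is an unknown here, so no parsing is attempted.
    """
    url = build_quality_url(domain, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to hit the network.
    print(build_quality_url("voice-ai", "ZET-Speech", "ZET-Speech-Demo"))
```

Because keyless access is limited to 100 requests per day, a script that polls this endpoint would likely want to cache responses rather than refetch on every run.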
Higher-rated alternatives
index-tts/index-tts: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
stepfun-ai/Step-Audio-EditX: A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing...
lucasnewman/f5-tts-mlx: Implementation of F5-TTS in MLX
unilight/seq2seq-vc: A sequence-to-sequence voice conversion toolkit.
FireRedTeam/FireRedTTS: An Open-Sourced LLM-empowered Foundation TTS System