ZET-Speech/ZET-Speech-Demo

ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models (TTS)


Experimental

This tool helps content creators, educators, or anyone needing to generate speech with specific emotions from text. You provide written text and an audio sample demonstrating the desired emotional tone, and it produces a natural-sounding audio recording of your text, spoken with that emotion. It's designed for professionals who need high-quality, emotionally expressive voiceovers without hiring voice actors for every nuance.

No commits in the last 6 months.

Use this if you need to quickly generate spoken audio that conveys a specific emotional style, based on a short example.

Not ideal if you require precise, syllable-level control over speech elements or if you only need a standard, unemotional voice.

content-creation voiceover-production e-learning-audio narration audio-content

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 8 / 25

Community 0 / 25


Language JavaScript

License —

Higher-rated alternatives

index-tts/index-tts

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

stepfun-ai/Step-Audio-EditX

A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing...

lucasnewman/f5-tts-mlx

Implementation of F5-TTS in MLX

unilight/seq2seq-vc

A sequence-to-sequence voice conversion toolkit.

FireRedTeam/FireRedTTS

An Open-Sourced LLM-empowered Foundation TTS System
