atomicoo/FCH-TTS
A fast Text-to-Speech (TTS) model. Works well for English, Mandarin/Chinese, Japanese, Korean, Russian and Tibetan (the languages tested so far).
According to the README, the project implements a parallel text-to-mel-spectrogram architecture with separate duration prediction and acoustic models, and uses a MelGAN vocoder for waveform generation. It supports multiple training configurations via YAML, includes pre-trained models for LJSpeech, and optionally integrates Weights & Biases logging for experiment tracking. It achieves real-time synthesis speeds on CPU/GPU and supports batch processing.
281 stars. No commits in the last 6 months.
Stars: 281
Forks: 47
Language: Python
License: MIT
Category:
Last pushed: Mar 25, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/atomicoo/FCH-TTS"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
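The same request can be sketched in Python using only the standard library. This is a minimal sketch, not a documented client: the helper names `quality_url` and `fetch_quality` are hypothetical, the category/owner/repo path layout is inferred from the single curl example, and because the response format is not described here the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example; everything after it is appended per request.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository.

    Assumption: the path is <category>/<owner>/<repo>, generalized from
    the one example URL shown on this page.
    """
    return f"{BASE_URL}/{quote(category)}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0) -> str:
    """Issue an unauthenticated GET and return the response body as text.

    The response schema is not documented here, so no parsing is attempted.
    """
    with urlopen(quality_url(category, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (performs a network request, counted against the daily limit):
# print(fetch_quality("voice-ai", "atomicoo/FCH-TTS"))
```

Keeping URL construction separate from the network call makes it possible to check the generated URL without spending any of the daily request quota.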
Higher-rated alternatives
TensorSpeech/TensorFlowTTS
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for...
lucasnewman/nanospeech
A simple, hackable text-to-speech system in PyTorch and MLX
Tomiinek/Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing,...
jxzhanggg/nonparaSeq2seqVC_code
Implementation code of non-parallel sequence-to-sequence VC
keonlee9420/STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech...