zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
Curated repository of research papers spanning the speech processing pipeline, from foundational methods (HMMs, connectionist temporal classification) through modern deep learning architectures (RNNs, CNNs, diffusion models). Alongside traditional tasks, it covers emerging applications including text-to-audio generation, music modeling, and confidence estimation. Papers are organized by topic with direct links to their PDFs, so researchers can trace the evolution of techniques from classical approaches to contemporary transformer and diffusion-based systems.
3,119 stars. No commits in the last 6 months.
Stars: 3,119
Forks: 513
Language: n/a
License: MIT
Category:
Last pushed: Oct 19, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/zzw922cn/awesome-speech-recognition-speech-synthesis-papers"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
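For scripted use, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the reading of the URL path as category / owner / repository is inferred from the curl example above. The page does not document the response schema, so the code assumes only that the endpoint returns JSON.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    Assumption: the three path segments are category, owner, and repository.
    """
    # Percent-encode each segment so unusual characters cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Issue the GET request and decode the body.

    Assumption: the endpoint returns JSON; the schema is not documented here,
    so the decoded object is returned without further interpretation.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "zzw922cn", "awesome-speech-recognition-speech-synthesis-papers")` sends the same request as the curl command. Each call counts against the daily quota, so caching the result locally is advisable when polling more than one repository.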