zzw922cn/awesome-speech-recognition-speech-synthesis-papers

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

/ 100

Emerging

Curated repository of research papers spanning the speech processing pipeline, from foundational methods (HMMs, connectionist temporal classification) through modern deep learning architectures (RNNs, CNNs, diffusion models). Alongside traditional tasks, it covers emerging applications including text-to-audio generation, music modeling, and confidence estimation. Papers are organized by topic with direct links to PDFs, so researchers can trace the evolution of techniques from classical approaches to contemporary transformer- and diffusion-based systems.

3,119 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 23 / 25

Stars: 3,119

Forks: 513

Language: —

License: MIT

Related tools

ivcylc/OpenMusic

OpenMusic: SOTA Text-to-music (TTM) Generation

guan-yuan/Awesome-Singing-Voice-Synthesis-and-Singing-Voice-Conversion

A paper and project list about the cutting edge Speech Synthesis, Text-to-Speech (TTS), Singing...

aidayang/LatentSync-OneClick

Free video lip-sync software: LatentSync one-click launch bundle