PaddlePaddle/PaddleSpeech

Easy-to-use speech toolkit including self-supervised learning models, SOTA/streaming ASR with punctuation, streaming TTS with a text frontend, a speaker verification system, end-to-end speech translation, and keyword spotting. Won the NAACL 2022 Best Demo Award.

82 / 100

Verified

Built on the PaddlePaddle framework, the toolkit implements streaming ASR/TTS systems with rule-based Chinese text normalization, polyphone handling, and tone sandhi processing through a dedicated linguistic frontend. It provides production-ready deployment via CLI, REST API server, and WebSocket streaming server interfaces, with pre-trained models optimized for both accuracy and inference speed across multiple languages including English, Mandarin, and Cantonese.
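To make the tone sandhi step concrete, here is a minimal sketch of one well-known Mandarin rule: a third tone directly followed by another third tone is pronounced as a second tone. This is an illustration only, not PaddleSpeech's frontend code. The function name `apply_third_tone_sandhi` and the numbered-pinyin input format are assumptions chosen for the example, and a real frontend would also take word and prosodic boundaries into account, which this sketch ignores.

```python
def apply_third_tone_sandhi(syllables):
    """Apply a simplified third-tone sandhi rule.

    `syllables` is a list of pinyin strings ending in a tone digit,
    e.g. ["ni3", "hao3"]. Hypothetical helper for illustration only.
    """
    result = list(syllables)
    # Compare tones in the original input so earlier rewrites do not
    # affect later decisions.
    for i in range(len(syllables) - 1):
        current_is_third = syllables[i].endswith("3")
        next_is_third = syllables[i + 1].endswith("3")
        if current_is_third and next_is_third:
            # A third tone before another third tone becomes second tone.
            result[i] = syllables[i][:-1] + "2"
    return result


if __name__ == "__main__":
    print(apply_third_tone_sandhi(["ni3", "hao3"]))
```

For runs of three or more third tones, the actual pronunciation depends on how the syllables group into words, so the uniform left-to-right rewrite above should be read as a simplification.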

12,556 stars and 3,580 monthly downloads. Actively maintained with 3 commits in the last 30 days. Available on PyPI.

Maintenance 16 / 25

Adoption 18 / 25

Maturity 25 / 25

Community 23 / 25

Stars

12,556

Forks

1,956

Language

Python

License

Apache-2.0

Related tools

k2-fsa/sherpa

Speech-to-text server framework with next-gen Kaldi

Picovoice/cheetah

On-device streaming speech-to-text engine powered by deep learning

yeyupiaoling/YeAudio

Audio tools for Python

zaigie/FunSpeech

Out-of-the-box speech service for local, private deployment; quickly sets up FunASR and CosyVoice2/3 backends

manyeyes/ManySpeech

AI Speech Solutions for Tasks such as ASR, Vocal Extraction, Accompaniment Extraction, Audio...
