PaddlePaddle/PaddleSpeech
An easy-to-use speech toolkit including self-supervised learning models, SOTA/streaming ASR with punctuation, streaming TTS with a text frontend, a speaker verification system, end-to-end speech translation, and keyword spotting. Won the NAACL 2022 Best Demo Award.
Built on the PaddlePaddle framework, the toolkit implements streaming ASR/TTS systems with rule-based Chinese text normalization, polyphone handling, and tone sandhi processing through a dedicated linguistic frontend. It provides production-ready deployment via CLI, REST API server, and WebSocket streaming server interfaces, with pre-trained models optimized for both accuracy and inference speed across multiple languages including English, Mandarin, and Cantonese.
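To make the tone sandhi step concrete, here is a minimal sketch of the best-known Mandarin rule: a third-tone syllable directly before another third-tone syllable is pronounced with the second tone. This is a toy illustration only, not PaddleSpeech's frontend code. The function name `apply_third_tone_sandhi` and the numbered-pinyin input format are assumptions made for the example, and real frontends also account for word boundaries and prosody, which this sketch ignores.

```python
def apply_third_tone_sandhi(syllables):
    """Toy rule: tone 3 directly before another tone 3 becomes tone 2.

    `syllables` is a list of numbered-pinyin strings such as "ni3".
    Hypothetical helper for illustration; not part of PaddleSpeech.
    """
    result = list(syllables)
    for i in range(len(syllables) - 1):
        # Compare against the original tones so each pair is judged independently.
        if syllables[i].endswith("3") and syllables[i + 1].endswith("3"):
            result[i] = syllables[i][:-1] + "2"
    return result


print(apply_third_tone_sandhi(["ni3", "hao3"]))  # ['ni2', 'hao3']
```

Runs of three or more third tones are handled here by a simplification (every third tone followed by a third tone changes); actual pronunciation depends on how the phrase is grouped.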
12,556 stars and 3,580 monthly downloads. Actively maintained with 3 commits in the last 30 days. Available on PyPI.
Stars: 12,556
Forks: 1,956
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 16, 2026
Monthly downloads: 3,580
Commits (30d): 3
Dependencies: 50
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/PaddlePaddle/PaddleSpeech"
Open to everyone: 100 requests/day without a key, or 1,000 requests/day with a free key.
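The same request can be made from Python with only the standard library. The sketch below assumes the endpoint path follows a category/owner/repository pattern (inferred from the single URL shown above) and that the response body is JSON; neither is confirmed by this page. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the path pattern is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record and decode it, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("voice-ai", "PaddlePaddle", "PaddleSpeech")
```

Because the keyless limit is 100 requests/day, caching the result locally is a sensible precaution when polling.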
Related tools
k2-fsa/sherpa
Speech-to-text server framework with next-gen Kaldi
Picovoice/cheetah
On-device streaming speech-to-text engine powered by deep learning
yeyupiaoling/YeAudio
Audio toolkit for Python
zaigie/FunSpeech
Out-of-the-box speech service for local, private deployment; quickly set up FunASR and CosyVoice2/3 backends
manyeyes/ManySpeech
AI Speech Solutions for Tasks such as ASR, Vocal Extraction, Accompaniment Extraction, Audio...