modelscope/ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Score: 47 / 100 (Emerging)

Supports speech super-resolution (bandwidth extension to 48kHz), audio-visual speaker extraction conditioned on face/gesture/EEG signals, and multi-format audio input (wav, mp3, flac, opus, etc.). The toolkit provides unified inference via a NumPy-array interface for flexible pipeline integration, plus separate training modules with data generation scripts for enhancement, separation, and super-resolution tasks. Integrates with ModelScope and HuggingFace for model distribution and includes SpeechScore, a quality assessment toolkit with intrusive and non-intrusive metrics (PESQ, STOI, DNSMOS, NISQA, DISTILL_MOS).

3,962 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 19 / 25
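
The four sub-scores above add up to the headline score of 47 out of 100. The short check below only confirms that arithmetic for the numbers on this page. That the site computes the headline as a plain, unweighted sum is an assumption, not something stated here.

```python
# Sub-scores copied from this page; each category is out of 25.
subscores = {
    "Maintenance": 2,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 19,
}

# Assumption: the headline score is the unweighted sum of the categories.
total = sum(subscores.values())
max_total = 25 * len(subscores)

print(f"{total} / {max_total}")  # prints "47 / 100"
```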


Stars: 3,962
Forks: 325
Language: Python
License: Apache-2.0
Last pushed: Aug 14, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/modelscope/ClearerVoice-Studio"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
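
The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, the path layout (category, owner, repository) is inferred from the single curl URL above, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    Assumption: the path is always <category>/<owner>/<repo>.
    """
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the body is JSON; json.loads raises ValueError otherwise.
    """
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "modelscope", "ClearerVoice-Studio")` issues the same GET request as the curl command and counts against the same daily request limit.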