OpenBMB/VoxCPM
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
VoxCPM uses an end-to-end diffusion autoregressive architecture, built on the MiniCPM-4 backbone, that generates continuous speech representations directly and so avoids the bottleneck of discrete tokenization. It achieves semantic-acoustic decoupling through hierarchical language modeling and finite scalar quantization (FSQ) constraints, enabling streaming synthesis at a real-time factor (RTF) of about 0.15 on consumer GPUs. It supports both full-parameter and LoRA fine-tuning via Hugging Face and ModelScope, with optional speech enhancement and ASR integration through ZipEnhancer and SenseVoice.
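To put the RTF figure in concrete terms: real-time factor is conventionally defined as synthesis time divided by the duration of the audio produced, so values below 1 mean faster than real time. The sketch below is plain arithmetic on that definition and does not call VoxCPM. The function names are hypothetical, and the 0.15 default is simply the figure quoted in the description above, not a measured result.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def synthesis_time(audio_seconds: float, rtf: float = 0.15) -> float:
    """Estimated wall-clock seconds needed to generate audio at a given RTF.

    The default rtf=0.15 is the approximate figure quoted in the listing;
    actual throughput depends on hardware and settings.
    """
    return audio_seconds * rtf


if __name__ == "__main__":
    # Estimate how long a 10-second clip would take at the quoted RTF.
    print(f"{synthesis_time(10.0):.2f} s to synthesize 10 s of audio")
```

Under this definition, an RTF of about 0.15 leaves most of each second of wall-clock time free, which is what makes streaming playback feasible.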
6,143 stars and 17,933 monthly downloads. Actively maintained with 2 commits in the last 30 days. Available on PyPI.
Stars: 6,143
Forks: 744
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 13, 2026
Monthly downloads: 17,933
Commits (30d): 2
Dependencies: 20
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/OpenBMB/VoxCPM"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
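The same request can be made from Python with only the standard library. This is a minimal sketch under stated assumptions: the URL pattern (a category segment followed by `owner/repo`) is inferred from the single curl example above, the response is assumed to be JSON, and `quality_url` and `fetch_quality` are hypothetical helper names rather than part of any published client.

```python
import json
import urllib.request

# Base path taken from the curl example; the segment layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository given as 'owner/name'."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data, assuming the endpoint returns a JSON body."""
    request = urllib.request.Request(
        quality_url(category, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("voice-ai", "OpenBMB/VoxCPM")
# print(json.dumps(data, indent=2))
```

Because the keyless tier is limited to 100 requests per day, it may be worth caching the result locally rather than refetching on every run.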
Related tools
IAHispano/Applio
A simple, high-quality voice conversion tool focused on ease of use and performance.
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
codename0og/codename-rvc-fork-4
Codename's rvc fork version 4, based on Applio.
JackismyShephard/ultimate-rvc
An app for creating audio-based content such as song covers and speech using Retrieval-based...
ArkanDash/Advanced-RVC-Inference
Advanced RVC Inference for quicker and effortless model downloads