wildminder/ComfyUI-VoxCPM
ComfyUI node for highly expressive speech and realistic zero-shot voice cloning
Implements a tokenizer-free diffusion-based TTS architecture built on MiniCPM-4 that models speech in continuous space rather than discrete tokens, enabling context-aware prosody generation. Includes native LoRA fine-tuning support within ComfyUI for custom voice style training, automatic model management with efficient VRAM offloading, and operates at 6.25Hz token rate for faster synthesis on consumer hardware. Integrates seamlessly with ComfyUI's node workflow system, supporting optional reference audio for voice cloning and compatible with multiple inference backends (CUDA, CPU, MPS, DirectML).
390 stars.
Stars
390
Forks
42
Language
Python
License
Apache-2.0
Category
Last pushed
Dec 17, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/wildminder/ComfyUI-VoxCPM"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Compare
Higher-rated alternatives
diodiogod/TTS-Audio-Suite
A ComfyUI custom node integration for multi-engine multi-language Text-to-Speech and Voice...
Enemyx-net/VibeVoice-ComfyUI
A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling...
wildminder/ComfyUI-VibeVoice
ComfyUI custom node for the VibeVoice TTS. Expressive, long-form, multi-speaker conversational audio
1038lab/ComfyUI-EdgeTTS
ComfyUI-EdgeTTS is a powerful text-to-speech node for ComfyUI, leveraging Microsoft's Edge TTS...
eigenpunk/ComfyUI-audio
some generative audio tools for ComfyUI