wildminder/ComfyUI-VoxCPM

ComfyUI node for highly expressive speech and realistic zero-shot voice cloning

Emerging

Implements a tokenizer-free, diffusion-based TTS architecture built on MiniCPM-4 that models speech in a continuous space rather than as discrete tokens, enabling context-aware prosody generation. Includes native LoRA fine-tuning support within ComfyUI for custom voice style training and automatic model management with VRAM offloading, and operates at a 6.25 Hz token rate for faster synthesis on consumer hardware. Integrates with ComfyUI's node workflow system, accepts optional reference audio for voice cloning, and is compatible with multiple inference backends (CUDA, CPU, MPS, DirectML).
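To give a rough sense of what a 6.25 Hz token rate means in practice, the sketch below estimates how many generation steps a clip of a given length would need. It is only back-of-the-envelope arithmetic: the function name `estimated_token_count` is hypothetical, and rounding up to a whole token is an assumption, not a description of how the model pads or truncates audio.

```python
import math

# Token rate quoted in the description above (tokens per second of audio).
TOKEN_RATE_HZ = 6.25


def estimated_token_count(duration_seconds: float,
                          token_rate_hz: float = TOKEN_RATE_HZ) -> int:
    """Estimate the number of tokens needed for a clip of the given length.

    Hypothetical helper for illustration only. Rounding up is an
    assumption; the actual node may handle partial tokens differently.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return math.ceil(duration_seconds * token_rate_hz)


if __name__ == "__main__":
    for seconds in (1, 8, 10, 60):
        print(f"{seconds:>3} s of audio -> about {estimated_token_count(seconds)} tokens")
```

Under these assumptions, a full minute of speech corresponds to a few hundred tokens, which is why a low token rate shortens the generation loop.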

390 stars.

No package. No dependents.

Maintenance 6 / 25

Adoption 10 / 25

Maturity 15 / 25

Community 16 / 25

Stars: 390

Forks

Language: Python

License: Apache-2.0

Compare

ComfyUI-VoxCPM and TTS-Audio-Suite
ComfyUI-VoxCPM and VibeVoice-ComfyUI
ComfyUI-VoxCPM and ComfyUI-VibeVoice
ComfyUI-VoxCPM and ComfyUI-Maya1_TTS
ComfyUI-VoxCPM and ComfyUI-XTTS
ComfyUI-VoxCPM and ComfyUI-SparkTTS
ComfyUI-VoxCPM and ComfyUI-KugelAudio
ComfyUI-VoxCPM and ComfyUI-FishSpeech
ComfyUI-VoxCPM and ComfyUI-GPT_SoVITS
ComfyUI-VoxCPM and ComfyUI-MegaTTS

Higher-rated alternatives

diodiogod/TTS-Audio-Suite

A ComfyUI custom node integration for multi-engine multi-language Text-to-Speech and Voice...

Enemyx-net/VibeVoice-ComfyUI

A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling...

wildminder/ComfyUI-VibeVoice

ComfyUI custom node for the VibeVoice TTS. Expressive, long-form, multi-speaker conversational audio

1038lab/ComfyUI-EdgeTTS

ComfyUI-EdgeTTS is a powerful text-to-speech node for ComfyUI, leveraging Microsoft's Edge TTS...

eigenpunk/ComfyUI-audio

some generative audio tools for ComfyUI
