ComfyUI-VibeVoice and ComfyUI-VoxCPM
These are complementary TTS tools that serve different use cases—VibeVoice excels at multi-speaker conversational audio while VoxCPM specializes in zero-shot voice cloning—so users might employ both depending on whether they need expressive dialogue generation or speaker-specific voice synthesis.
About ComfyUI-VibeVoice
wildminder/ComfyUI-VibeVoice
ComfyUI custom node for VibeVoice TTS: expressive, long-form, multi-speaker conversational audio.
Integrates Microsoft's VibeVoice model directly into ComfyUI workflows for multi-speaker dialogue generation, supporting voice cloning via reference audio and hybrid zero-shot voice generation. Features 4-bit LLM quantization, multiple attention backends (eager/SDPA/Flash Attention/SageAttention), and automatic model management with configurable diffusion parameters for fine-grained control over speech synthesis.
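A node that exposes several attention backends typically falls back gracefully when the requested one is not installed. The sketch below illustrates that pattern; the backend names mirror the options listed above, but `pick_attention_backend` and its fallback order are hypothetical illustrations, not the node's actual code.

```python
# Hypothetical sketch: choose an attention backend with graceful fallback.
# Backend names (sage / flash_attention_2 / sdpa / eager) mirror the options
# the node exposes; the selection logic here is illustrative only.

PREFERRED_ORDER = ["sage", "flash_attention_2", "sdpa", "eager"]

def pick_attention_backend(requested: str, available: set) -> str:
    """Return the requested backend if available, else the best fallback."""
    if requested in available:
        return requested
    for backend in PREFERRED_ORDER:
        if backend in available:
            return backend
    return "eager"  # eager attention is always available in PyTorch
```

For example, requesting `flash_attention_2` on a machine that only has SDPA would fall back to `sdpa` rather than failing the workflow.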
About ComfyUI-VoxCPM
wildminder/ComfyUI-VoxCPM
ComfyUI node for highly expressive speech and realistic zero-shot voice cloning
Implements a tokenizer-free, diffusion-based TTS architecture built on MiniCPM-4 that models speech in continuous space rather than discrete tokens, enabling context-aware prosody generation. Includes native LoRA fine-tuning support within ComfyUI for training custom voice styles, provides automatic model management with efficient VRAM offloading, and operates at a 6.25 Hz token rate for faster synthesis on consumer hardware. Integrates with ComfyUI's node workflow system, supports optional reference audio for voice cloning, and is compatible with multiple inference backends (CUDA, CPU, MPS, DirectML).
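The 6.25 Hz token rate translates directly into shorter sequences: a one-minute clip needs only 375 frames. A quick back-of-the-envelope comparison, where the 50 Hz discrete-codec rate is an illustrative assumption for a typical token-based TTS and not a VoxCPM specification:

```python
# Back-of-the-envelope: sequence length at a given token (frame) rate.
def frames_for(duration_s: float, rate_hz: float) -> int:
    """Number of frames/tokens needed to represent `duration_s` seconds."""
    return round(duration_s * rate_hz)

voxcpm_frames = frames_for(60, 6.25)  # VoxCPM's stated 6.25 Hz rate -> 375
codec_tokens = frames_for(60, 50.0)   # assumed 50 Hz codec rate -> 3000
ratio = codec_tokens / voxcpm_frames  # ~8x fewer steps per second of audio
```

Fewer frames per second of audio means fewer autoregressive or diffusion steps, which is where the speed advantage on consumer hardware comes from.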