ComfyUI-VoxCPM and ComfyUI-SparkTTS

These tools are competitors: each provides a ComfyUI node for text-to-speech generation. VoxCPM focuses on expressive speech and zero-shot voice cloning, while SparkTTS uses large language models (LLMs) to produce natural-sounding speech.

                  ComfyUI-VoxCPM              ComfyUI-SparkTTS
Score             47                          41
Tier              Emerging                    Emerging
Maintenance       6/25                        2/25
Adoption          10/25                       10/25
Maturity          15/25                       16/25
Community         16/25                       13/25
Stars             390                         124
Forks             42                          13
Downloads         (not listed)                (not listed)
Commits (30d)     0                           0
Language          Python                      Python
License           Apache-2.0                  GPL-3.0
Flags             No Package, No Dependents   Stale 6m, No Package, No Dependents
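
The headline scores appear to be the sum of the four category scores, each out of 25. That is an observation from the figures listed above, not a documented formula, so the aggregation rule in this minimal sketch is an assumption. The function name `total_score` is hypothetical; the numbers are copied from the listing.

```python
# Category scores copied from the comparison above; each is out of 25.
SUBSCORES = {
    "ComfyUI-VoxCPM": {"Maintenance": 6, "Adoption": 10, "Maturity": 15, "Community": 16},
    "ComfyUI-SparkTTS": {"Maintenance": 2, "Adoption": 10, "Maturity": 16, "Community": 13},
}


def total_score(categories):
    """Sum the category scores (assumed aggregation rule, not confirmed by the source)."""
    return sum(categories.values())


totals = {name: total_score(cats) for name, cats in SUBSCORES.items()}
for name, total in totals.items():
    print(f"{name}: {total}/100")
```

Under this assumption the sums match the listed headline scores (47 and 41), and the six-point gap comes mostly from Maintenance (6 vs 2) and Community (16 vs 13), with SparkTTS one point ahead on Maturity.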

About ComfyUI-VoxCPM

wildminder/ComfyUI-VoxCPM

ComfyUI node for highly expressive speech and realistic zero-shot voice cloning

Implements a tokenizer-free, diffusion-based TTS architecture built on MiniCPM-4 that models speech in continuous space rather than as discrete tokens, enabling context-aware prosody generation. Includes native LoRA fine-tuning support within ComfyUI for custom voice style training and automatic model management with VRAM offloading, and operates at a 6.25 Hz token rate for faster synthesis on consumer hardware. Integrates with ComfyUI's node workflow system, supports optional reference audio for voice cloning, and is compatible with multiple inference backends (CUDA, CPU, MPS, DirectML).

About ComfyUI-SparkTTS

1038lab/ComfyUI-SparkTTS

ComfyUI-SparkTTS is a custom ComfyUI node implementation of SparkTTS, a text-to-speech system that uses large language models (LLMs) to generate accurate, natural-sounding speech.

Scores updated daily from GitHub, PyPI, and npm data.