ComfyUI-VoxCPM and ComfyUI-GPT_SoVITS
Both are ComfyUI custom nodes providing speech synthesis and voice cloning capabilities, making them direct competitors in the "comfyui-tts-nodes" category.
About ComfyUI-VoxCPM
wildminder/ComfyUI-VoxCPM
ComfyUI node for highly expressive speech and realistic zero-shot voice cloning
Implements a tokenizer-free, diffusion-based TTS architecture built on MiniCPM-4 that models speech in continuous space rather than as discrete tokens, enabling context-aware prosody generation. Includes native LoRA fine-tuning within ComfyUI for custom voice style training, plus automatic model management with VRAM offloading. The model runs at a 6.25 Hz token rate, which keeps the generated sequence short and speeds up synthesis on consumer hardware. Plugs into ComfyUI's node workflow system, accepts optional reference audio for voice cloning, and supports multiple inference backends (CUDA, CPU, MPS, DirectML).
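To make the 6.25 Hz figure concrete, here is a minimal sketch of the arithmetic relating audio duration to sequence length. The function name and the 50 Hz comparison rate are illustrative assumptions, not values taken from VoxCPM or the node.

```python
import math

# Token rate quoted in the description above.
TOKEN_RATE_HZ = 6.25


def steps_for_duration(seconds: float, rate_hz: float = TOKEN_RATE_HZ) -> int:
    """Return how many sequence positions cover `seconds` of audio.

    Hypothetical helper for illustration only; it is not part of the node.
    """
    if seconds < 0:
        raise ValueError("duration must be non-negative")
    # Round up so a partial final step is still counted.
    return math.ceil(seconds * rate_hz)


if __name__ == "__main__":
    # 10 s of speech at 6.25 Hz versus an assumed 50 Hz comparison rate.
    print(steps_for_duration(10))
    print(steps_for_duration(10, rate_hz=50.0))
```

A lower rate means fewer positions to generate for the same clip, which is the sense in which the rate affects synthesis speed.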
About ComfyUI-GPT_SoVITS
AIFSH/ComfyUI-GPT_SoVITS
A ComfyUI custom node for GPT-SoVITS that brings voice cloning and TTS to ComfyUI
Integrates GPT-SoVITS voice synthesis into ComfyUI's node-based workflow, with support for fine-tuning and for multi-speaker inference driven by SRT subtitle files for precise speaker control. Pre-trained models are downloaded automatically from Hugging Face, and ffmpeg is the only external dependency. Its nodes can be chained with other ComfyUI nodes to build end-to-end audio generation pipelines.
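For readers unfamiliar with the SRT format mentioned above, the following is a minimal sketch of reading an SRT file into timed text cues using only the Python standard library. It is a generic parser written for illustration; the `Cue` class and `parse_srt` function are hypothetical names, and how the node itself reads SRT files or assigns speakers to cues is not described here.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int    # sequence number from the SRT block
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # subtitle text (may span several lines)


# SRT timestamps look like 00:00:01,000 (some files use a dot instead).
_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content: str) -> list[Cue]:
    """Split SRT text into cues; blocks are separated by blank lines."""
    cues = []
    normalized = content.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip empty or incomplete blocks
        start, end = lines[1].split("-->")
        cues.append(
            Cue(int(lines[0]), _to_seconds(start), _to_seconds(end),
                "\n".join(lines[2:]))
        )
    return cues
```

Each cue carries a time window and its text, which is the information a TTS pipeline would need to place a synthesized line on a timeline.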