eigenpunk/ComfyUI-audio

some generative audio tools for ComfyUI

Emerging

Provides several generative audio models (Tacotron2, VALL-E X, Tortoise, MusicGen, AudioGen) as ComfyUI nodes for text-to-speech, text-to-music, and audio continuation workflows. Wraps established research implementations (NVIDIA's Tacotron2, Meta's AudioCraft, and community forks) and adds audio utility nodes for conversion and export. Targets GPU-accelerated inference on CUDA 12.1 or 11.8 and is primarily tested on Linux.

101 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 9 / 25

Maturity 16 / 25

Community 19 / 25

Stars

101

Forks

Language

Python

License

GPL-3.0

Higher-rated alternatives

diodiogod/TTS-Audio-Suite

A ComfyUI custom node integration for multi-engine multi-language Text-to-Speech and Voice...

Enemyx-net/VibeVoice-ComfyUI

A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling...

wildminder/ComfyUI-VibeVoice

ComfyUI custom node for the VibeVoice TTS. Expressive, long-form, multi-speaker conversational audio

1038lab/ComfyUI-EdgeTTS

ComfyUI-EdgeTTS is a powerful text-to-speech node for ComfyUI, leveraging Microsoft's Edge TTS...

wildminder/ComfyUI-VoxCPM

ComfyUI node for highly expressive speech and realistic zero-shot voice cloning
