HAKORADev/VODER

Voice Operation and Design Engine with Reproduction capabilities

/ 100

Emerging

Supports speech-to-text, text-to-speech, voice cloning, and music generation through six processing modes (STT+TTS, TTS, TTS+VC, STS, TTM, TTM+VC) built on the Whisper, Qwen3-TTS, Seed-VC, and ACE-Step models. Includes a dialogue editor for multi-character audio such as podcasts and audiobooks, with optional auto-generated background music, and offers both a GUI and a CLI. Needs no external APIs: it runs fully offline on a local machine with FFmpeg installed, or in Google Colab.
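As a rough illustration of how the six mode names could map onto the four models, here is a minimal Python sketch. The stage assignments are an assumption inferred from the mode abbreviations, not taken from the VODER source, and the names MODE_STAGES, stages_for, and models_needed are hypothetical helpers invented for this example.

```python
# Assumed mapping from each mode name to an ordered list of model stages.
# This is inferred from the abbreviations in the listing, not from VODER's code.
MODE_STAGES = {
    "STT+TTS": ["Whisper", "Qwen3-TTS"],   # transcribe, then re-synthesize
    "TTS": ["Qwen3-TTS"],                  # text to speech
    "TTS+VC": ["Qwen3-TTS", "Seed-VC"],    # synthesize, then convert the voice
    "STS": ["Seed-VC"],                    # speech to speech
    "TTM": ["ACE-Step"],                   # text to music
    "TTM+VC": ["ACE-Step", "Seed-VC"],     # generate music, then convert the voice
}


def stages_for(mode):
    """Return the assumed ordered model stages for a mode name (case-insensitive)."""
    key = mode.strip().upper()
    if key not in MODE_STAGES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(MODE_STAGES)}")
    return list(MODE_STAGES[key])


def models_needed(modes):
    """Return the sorted set of models required to run all the given modes."""
    needed = set()
    for mode in modes:
        needed.update(stages_for(mode))
    return sorted(needed)


if __name__ == "__main__":
    for name in MODE_STAGES:
        print(name, "->", " -> ".join(stages_for(name)))
```

Under these assumptions, a lookup such as stages_for("tts+vc") shows which models a given mode would need to have downloaded before running offline.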

No Package, No Dependents

Maintenance 13 / 25

Adoption 10 / 25

Maturity 11 / 25

Community 11 / 25

Stars

116

Forks

Language

Python

License

MIT

Higher-rated alternatives

shangeth/wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...

fatchord/WaveRNN

WaveRNN Vocoder + TTS

kan-bayashi/ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

seungwonpark/melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)

rishikksh20/iSTFTNet-pytorch

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...
