rishikksh20/VocGAN

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

40 / 100

Emerging

A modified PyTorch implementation of VocGAN that replaces the original hierarchically-nested discriminator with the Full-Band MelGAN discriminator, reducing training time while maintaining audio fidelity and real-time vocoding from mel-spectrograms. It supports single-speaker (LJSpeech, KSS) and multi-speaker (VCTK) datasets at 22.05 kHz, and includes a trainer with TensorBoard integration and an inference pipeline for mel-to-audio conversion.
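To make the "real-time" claim concrete: a vocoder runs in real time when it synthesizes audio faster than that audio plays back. The sketch below is not taken from the repository. The hop length of 256 samples per mel frame is an assumption (a common setting at 22.05 kHz), and `audio_seconds` and `real_time_factor` are hypothetical helper names used only for illustration.

```python
SAMPLE_RATE = 22_050  # Hz, the sample rate stated in the description
HOP_LENGTH = 256      # samples per mel frame; ASSUMPTION, not confirmed by the source


def audio_seconds(n_mel_frames: int, hop_length: int = HOP_LENGTH,
                  sample_rate: int = SAMPLE_RATE) -> float:
    """Duration of the waveform a vocoder produces from n_mel_frames frames."""
    return n_mel_frames * hop_length / sample_rate


def real_time_factor(synthesis_seconds: float, n_mel_frames: int) -> float:
    """Synthesis time divided by audio duration; below 1.0 is faster than real time."""
    return synthesis_seconds / audio_seconds(n_mel_frames)


if __name__ == "__main__":
    frames = 861  # roughly 10 s of audio under the assumed hop length
    print(f"{audio_seconds(frames):.2f} s of audio")
    # 0.5 s is a made-up synthesis time, used only to exercise the function
    print(f"RTF = {real_time_factor(0.5, frames):.3f}")
```

In practice the synthesis time would be measured around the model's forward pass (for example with `time.perf_counter()`), and the frame count read from the input mel-spectrogram's time axis.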

321 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 9 / 25

Community 21 / 25


Stars: 321

Forks

Language: Python

License: MIT

Higher-rated alternatives

shangeth/wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...

fatchord/WaveRNN

WaveRNN Vocoder + TTS

kan-bayashi/ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

HAKORADev/VODER

Voice Operation and Design Engine with Reproduction capabilities

seungwonpark/melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)
