Neural Vocoder Implementations Voice AI Tools
Tools and models for converting mel-spectrograms or acoustic features into high-fidelity waveforms using neural networks (GANs, diffusion, autoregressive models). Does NOT include end-to-end TTS systems, speech recognition, or general audio processing.
There are 71 neural vocoder implementation tools tracked. The highest-rated is shangeth/wavencoder, at 47/100, with 92 stars and 49 monthly downloads.
Get all 71 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=voice-ai&subcategory=neural-vocoder-implementations&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
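The same request can be made from a script. The following is a minimal sketch in Python using only the standard library; it reuses the URL and query parameters from the curl command above. The helper names `build_url` and `fetch` are hypothetical, and the structure of the returned JSON is not documented here, so the code decodes the payload without assuming any field names. Whether `limit` accepts values above 20 is also an assumption to verify against the API.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the dataset URL with the same query parameters as the curl command."""
    params = {
        "domain": "voice-ai",
        "subcategory": "neural-vocoder-implementations",
        "limit": limit,
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch(limit=20, timeout=30):
    """Fetch and decode the JSON payload (hypothetical helper).

    No assumption is made about the payload's shape; inspect it before use.
    """
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (performs a network request, counts against the daily quota):
# data = fetch()
# print(type(data).__name__)
```

Keeping URL construction separate from the network call makes it easy to check the query string without spending any of the 100 daily requests.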
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | shangeth/wavencoder<br>WavEncoder is a Python library for encoding audio signals, transforms for... | | Emerging |
| 2 | fatchord/WaveRNN<br>WaveRNN Vocoder + TTS | | Emerging |
| 3 | kan-bayashi/ParallelWaveGAN<br>Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN &... | | Emerging |
| 4 | HAKORADev/VODER<br>Voice Operation and Design Engine with Reproduction capabilities | | Emerging |
| 5 | seungwonpark/melgan<br>MelGAN vocoder (compatible with NVIDIA/tacotron2) | | Emerging |
| 6 | rishikksh20/iSTFTNet-pytorch<br>iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating... | | Emerging |
| 7 | lucasnewman/best-rq-pytorch<br>Implementation of BEST-RQ - a model for self-supervised learning of speech... | | Emerging |
| 8 | rishikksh20/VocGAN<br>VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested... | | Emerging |
| 9 | Deepest-Project/MelNet<br>Implementation of "MelNet: A Generative Model for Audio in the Frequency Domain" | | Emerging |
| 10 | AmphionTeam/FlexiCodec<br>[ICLR2026] FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates | | Emerging |
| 11 | tiberiu44/TTS-Cube<br>End-2-end speech synthesis with recurrent neural networks | | Emerging |
| 12 | npuichigo/waveglow<br>A PyTorch implementation of the WaveGlow: A Flow-based Generative Network... | | Emerging |
| 13 | 34j/neural-source-filter<br>Python package for NSF and NSF-HiFi-GAN (unofficial) | | Emerging |
| 14 | rishikksh20/Fre-GAN-pytorch<br>Fre-GAN: Adversarial Frequency-consistent Audio Synthesis | | Emerging |
| 15 | rishikksh20/TFGAN<br>TFGAN: Time and Frequency Domain Based Generative Adversarial Network for... | | Emerging |
| 16 | yerfor/SyntaSpeech<br>SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022;... | | Emerging |
| 17 | jishengpeng/WavTokenizer<br>[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second... | | Emerging |
| 18 | rishikksh20/melgan<br>MelGAN implementation with Multi-Band and Full Band supports... | | Emerging |
| 19 | keonlee9420/WaveGrad2<br>PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement... | | Emerging |
| 20 | zceng/LVCNet<br>LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation | | Emerging |
| 21 | keonlee9420/PortaSpeech<br>PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative... | | Emerging |
| 22 | BogiHsu/WG-WaveNet<br>Real-Time High-Fidelity Speech Synthesis without GPU | | Emerging |
| 23 | hcy71o/AutoVocoder<br>Autovocoder: Fast Waveform Generation from a Learned Speech Representation... | | Emerging |
| 24 | modelscope/FunCodec<br>FunCodec is a research-oriented toolkit for audio quantization and... | | Emerging |
| 25 | tuan3w/cnn_vocoder<br>A fast cnn-based vocoder | | Emerging |
| 26 | warisqr007/vocos<br>Causal version of Vocos (neural vocoders for high-quality audio synthesis)... | | Emerging |
| 27 | rishikksh20/Avocodo-pytorch<br>Avocodo: Generative Adversarial Network for Artifact-free Vocoder | | Emerging |
| 28 | cvqluu/TDNN<br>Time delay neural network (TDNN) implementation in Pytorch using unfold method | | Emerging |
| 29 | rishikksh20/UnivNet-pytorch<br>UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators... | | Emerging |
| 30 | hhguo/SoCodec<br>Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications | | Emerging |
| 31 | andi611/Conditional-SpecGAN-Tensorflow<br>Text-to-Speech Synthesis by Generating Spectrograms using Generative... | | Emerging |
| 32 | zsl24/Tacotron2-Mandarin-HiFiGAN<br>Implementation of TTS with combination of Tacotron2 and HiFi-GAN | | Emerging |
| 33 | rishikksh20/iSTFT-Avocodo-pytorch<br>Ultrafast GAN based Vocoder for Text to Speech | | Emerging |
| 34 | Rongjiehuang/Multiband-WaveRNN<br>An unofficial implement of autoregressive vocoder Multiband-WaveRNN. Audio... | | Emerging |
| 35 | Fraunhofer-AISEC/towards-resistant-audio-adversarial-examples<br>Generation tool for offset-resistant audio adversarial examples against Deepspeech | | Experimental |
| 36 | hi-paris/wavlm-vocoder-french<br>WavLM-to-Audio neural vocoder for French speech reconstruction - layer... | | Experimental |
| 37 | maetshju/flux-blstm-implementation<br>An implementation of the Graves & Schmidhuber (2005) bidirectional LSTM in Flux. | | Experimental |
| 38 | HarunoriKawano/BEST-RQ<br>Implementation of the paper "Self-supervised Learning with Random-projection... | | Experimental |
| 39 | WindQAQ/tensorflow-wavenet<br>Implementation of WaveNet network based on Tensorflow. | | Experimental |
| 40 | philsyn/DiffWave-Vocoder<br>Pytorch Reimplementation of DiffWave Vocoder: a high quality, fast, and... | | Experimental |
| 41 | candlewill/AiVoice<br>Deep CNN networks for Speech Synthesis | | Experimental |
| 42 | vliu15/adversarial-tts<br>End-to-end Text-to-Speech with Generative Adversarial Networks | | Experimental |
| 43 | ryhorv/tf-flowavenet<br>Tensorflow implementation of "FloWaveNet: A Generative Flow for Raw Audio" | | Experimental |
| 44 | anooptoffy/DLJeju2018CodeRepoASR<br>Details on my work on using GANs for speech synthesis for improving Speech... | | Experimental |
| 45 | lucadellalib/audiocodecs<br>A collections of audio codecs with a standardized API | | Experimental |
| 46 | zzw922cn/LPC_for_TTS<br>Linear Prediction Coefficients estimation from mel-spectrogram implemented... | | Experimental |
| 47 | nilakshdas/ADAGIO<br>Adversarial Defense for Audio in a Gadget with Interactive Operations | | Experimental |
| 48 | diggerdu/pytorch_audio<br>audio processing module for pytorch:stft, istft | | Experimental |
| 49 | warisqr007/vq-bnf<br>Vector Quantizing speech representations | | Experimental |
| 50 | Barbany/Multi-speaker-Neural-Vocoder<br>Bachelor's thesis carried at Universitat Politecnica de Catalunya in partial... | | Experimental |
| 51 | azraelkuan/FFTNet<br>FFTNet: a Real-Time Speaker-Dependent Neural Vocoder | | Experimental |
| 52 | rafaelvalle/asrgen<br>Attacking Speaker Recognition with Deep Generative Models | | Experimental |
| 53 | DillionLowry/NeuralCodecs<br>Neural Audio Codecs implemented in C# - DAC, SNAC, Encodec, Dia | | Experimental |
| 54 | khaykingleb/hifi-gan<br>Neural vocoder for high-fidelity speech synthesis (implementation of the... | | Experimental |
| 55 | jik876/hifi-gan-demo<br>Audio samples from "HiFi-GAN: Generative Adversarial Networks for Efficient... | | Experimental |
| 56 | dimitriStoidis/GenGAN<br>Repository for the paper: Generating gender-ambiguous voices for... | | Experimental |
| 57 | Xinghui-Wu/KENKU<br>KENKU: Towards Efficient and Stealthy Black-box Adversarial Attacks against... | | Experimental |
| 58 | p1an-lin-jung/WavThruVec_pytorch<br>An implementation of Charactr, Inc's "WavThruVec: Latent speech... | | Experimental |
| 59 | rishikksh20/voxtral-codec-pytoch<br>Voxtral Codec : Combining Semantic VQ and Acoustic FSQ for Ultra-Low Bitrate... | | Experimental |
| 60 | PeechApp/tts-peech<br>DelightfulTTS with Hifi-GAN and Univnet vocoders | | Experimental |
| 61 | aminul-huq/Adversarial-Examples-For-Audio-Data<br>Repo for papers to read on adversarial attack and defense techniques in the... | | Experimental |
| 62 | egorsmkv/radtts-hifigan<br>RADTTS + HiFiGAN vocoder | | Experimental |
| 63 | ZhanpengWang96/pytorch-speech2vec<br>Pytorch implementation of the paper Speech2Vec: A Sequence-to-Sequence... | | Experimental |
| 64 | Orca0917/Spectrogram-VQ<br>Unofficial implementation of Spectrogram VQ from DCTTS paper - Vector... | | Experimental |
| 65 | NTT123/hifigan-tpu<br>Train HiFi-GAN on TPU | | Experimental |
| 66 | diver-j/melgan-multi<br>MelGAN Multi GPU Implementation. | | Experimental |
| 67 | will-rice/diffwave<br>TensorFlow 2.0 Implementation of DiffWave: A Versatile Diffusion Model for... | | Experimental |
| 68 | mzyICT/MSDGAN<br>Speech denoising based on generative adversarial networks | | Experimental |
| 69 | che-roman/mb-melgan<br>Unofficial implementation of Multi-band MelGAN | | Experimental |
| 70 | neyudin/wavenetglow<br>Main repository for the "Modern Methods of Speech Recognition and Synthesis"... | | Experimental |
| 71 | StellarTerror/NeuralVocoders<br>Implementations of HiFi-GAN, iSTFTNet and MISRNet | | Experimental |