Neural Vocoder Implementations Voice AI Tools

Tools and models for converting mel-spectrograms or acoustic features into high-fidelity waveforms using neural networks (GANs, diffusion, autoregressive models). Does NOT include end-to-end TTS systems, speech recognition, or general audio processing.

There are 71 neural vocoder implementation tools tracked. The highest-rated is shangeth/wavencoder at 47/100, with 92 stars and 49 monthly downloads.

Get all 71 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=voice-ai&subcategory=neural-vocoder-implementations&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
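The same request can be made from a script. The sketch below uses only the Python standard library and mirrors the curl command above: the endpoint and the `domain`, `subcategory`, and `limit` parameters come from that command, while the function names `build_url` and `fetch_projects` are hypothetical. The shape of the JSON response is not documented on this page, so the code returns the decoded payload without assuming any field names. Note that the example uses `limit=20`; whether a larger value returns all 71 projects in one call is not stated here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="voice-ai",
              subcategory="neural-vocoder-implementations",
              limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Perform the GET request and decode the body as JSON.

    The response schema is an unknown here, so the decoded object is
    returned as-is for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` performs the request; keyless access is limited to 100 requests/day, so it is worth caching the result locally rather than refetching.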

| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | shangeth/wavencoder | WavEncoder is a Python library for encoding audio signals, transforms for... | 47 | Emerging |
| 2 | fatchord/WaveRNN | WaveRNN Vocoder + TTS | 44 | Emerging |
| 3 | kan-bayashi/ParallelWaveGAN | Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN &... | 44 | Emerging |
| 4 | HAKORADev/VODER | Voice Operation and Design Engine with Reproduction capabilities | 43 | Emerging |
| 5 | seungwonpark/melgan | MelGAN vocoder (compatible with NVIDIA/tacotron2) | 42 | Emerging |
| 6 | rishikksh20/iSTFTNet-pytorch | iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating... | 42 | Emerging |
| 7 | lucasnewman/best-rq-pytorch | Implementation of BEST-RQ - a model for self-supervised learning of speech... | 41 | Emerging |
| 8 | rishikksh20/VocGAN | VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested... | 40 | Emerging |
| 9 | Deepest-Project/MelNet | Implementation of "MelNet: A Generative Model for Audio in the Frequency Domain" | 40 | Emerging |
| 10 | AmphionTeam/FlexiCodec | [ICLR2026] FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates | 40 | Emerging |
| 11 | tiberiu44/TTS-Cube | End-2-end speech synthesis with recurrent neural networks | 40 | Emerging |
| 12 | npuichigo/waveglow | A PyTorch implementation of the WaveGlow: A Flow-based Generative Network... | 39 | Emerging |
| 13 | 34j/neural-source-filter | Python package for NSF and NSF-HiFi-GAN (unofficial) | 39 | Emerging |
| 14 | rishikksh20/Fre-GAN-pytorch | Fre-GAN: Adversarial Frequency-consistent Audio Synthesis | 38 | Emerging |
| 15 | rishikksh20/TFGAN | TFGAN: Time and Frequency Domain Based Generative Adversarial Network for... | 37 | Emerging |
| 16 | yerfor/SyntaSpeech | SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022;... | 37 | Emerging |
| 17 | jishengpeng/WavTokenizer | [ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second... | 37 | Emerging |
| 18 | rishikksh20/melgan | MelGAN implementation with Multi-Band and Full Band supports... | 36 | Emerging |
| 19 | keonlee9420/WaveGrad2 | PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement... | 36 | Emerging |
| 20 | zceng/LVCNet | LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation | 36 | Emerging |
| 21 | keonlee9420/PortaSpeech | PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative... | 36 | Emerging |
| 22 | BogiHsu/WG-WaveNet | Real-Time High-Fidelity Speech Synthesis without GPU | 35 | Emerging |
| 23 | hcy71o/AutoVocoder | Autovocoder: Fast Waveform Generation from a Learned Speech Representation... | 35 | Emerging |
| 24 | modelscope/FunCodec | FunCodec is a research-oriented toolkit for audio quantization and... | 34 | Emerging |
| 25 | tuan3w/cnn_vocoder | A fast CNN-based vocoder | 34 | Emerging |
| 26 | warisqr007/vocos | Causal version of Vocos (neural vocoders for high-quality audio synthesis)... | 33 | Emerging |
| 27 | rishikksh20/Avocodo-pytorch | Avocodo: Generative Adversarial Network for Artifact-free Vocoder | 33 | Emerging |
| 28 | cvqluu/TDNN | Time delay neural network (TDNN) implementation in PyTorch using unfold method | 32 | Emerging |
| 29 | rishikksh20/UnivNet-pytorch | UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators... | 31 | Emerging |
| 30 | hhguo/SoCodec | Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications | 30 | Emerging |
| 31 | andi611/Conditional-SpecGAN-Tensorflow | Text-to-Speech Synthesis by Generating Spectrograms using Generative... | 30 | Emerging |
| 32 | zsl24/Tacotron2-Mandarin-HiFiGAN | Implementation of TTS with combination of Tacotron2 and HiFi-GAN | 30 | Emerging |
| 33 | rishikksh20/iSTFT-Avocodo-pytorch | Ultrafast GAN based Vocoder for Text to Speech | 30 | Emerging |
| 34 | Rongjiehuang/Multiband-WaveRNN | An unofficial implementation of autoregressive vocoder Multiband-WaveRNN. Audio... | 30 | Emerging |
| 35 | Fraunhofer-AISEC/towards-resistant-audio-adversarial-examples | Generation tool for offset-resistant audio adversarial examples against Deepspeech | 29 | Experimental |
| 36 | hi-paris/wavlm-vocoder-french | WavLM-to-Audio neural vocoder for French speech reconstruction - layer... | 28 | Experimental |
| 37 | maetshju/flux-blstm-implementation | An implementation of the Graves & Schmidhuber (2005) bidirectional LSTM in Flux. | 28 | Experimental |
| 38 | HarunoriKawano/BEST-RQ | Implementation of the paper "Self-supervised Learning with Random-projection... | 28 | Experimental |
| 39 | WindQAQ/tensorflow-wavenet | Implementation of WaveNet network based on TensorFlow. | 28 | Experimental |
| 40 | philsyn/DiffWave-Vocoder | PyTorch Reimplementation of DiffWave Vocoder: a high quality, fast, and... | 27 | Experimental |
| 41 | candlewill/AiVoice | Deep CNN networks for Speech Synthesis | 27 | Experimental |
| 42 | vliu15/adversarial-tts | End-to-end Text-to-Speech with Generative Adversarial Networks | 27 | Experimental |
| 43 | ryhorv/tf-flowavenet | TensorFlow implementation of "FloWaveNet: A Generative Flow for Raw Audio" | 26 | Experimental |
| 44 | anooptoffy/DLJeju2018CodeRepoASR | Details on my work on using GANs for speech synthesis for improving Speech... | 26 | Experimental |
| 45 | lucadellalib/audiocodecs | A collection of audio codecs with a standardized API | 26 | Experimental |
| 46 | zzw922cn/LPC_for_TTS | Linear Prediction Coefficients estimation from mel-spectrogram implemented... | 25 | Experimental |
| 47 | nilakshdas/ADAGIO | Adversarial Defense for Audio in a Gadget with Interactive Operations | 25 | Experimental |
| 48 | diggerdu/pytorch_audio | Audio processing module for PyTorch: STFT, iSTFT | 24 | Experimental |
| 49 | warisqr007/vq-bnf | Vector Quantizing speech representations | 24 | Experimental |
| 50 | Barbany/Multi-speaker-Neural-Vocoder | Bachelor's thesis carried out at Universitat Politecnica de Catalunya in partial... | 24 | Experimental |
| 51 | azraelkuan/FFTNet | FFTNet: a Real-Time Speaker-Dependent Neural Vocoder | 24 | Experimental |
| 52 | rafaelvalle/asrgen | Attacking Speaker Recognition with Deep Generative Models | 23 | Experimental |
| 53 | DillionLowry/NeuralCodecs | Neural Audio Codecs implemented in C# - DAC, SNAC, Encodec, Dia | 22 | Experimental |
| 54 | khaykingleb/hifi-gan | Neural vocoder for high-fidelity speech synthesis (implementation of the... | 22 | Experimental |
| 55 | jik876/hifi-gan-demo | Audio samples from "HiFi-GAN: Generative Adversarial Networks for Efficient... | 21 | Experimental |
| 56 | dimitriStoidis/GenGAN | Repository for the paper: Generating gender-ambiguous voices for... | 21 | Experimental |
| 57 | Xinghui-Wu/KENKU | KENKU: Towards Efficient and Stealthy Black-box Adversarial Attacks against... | 20 | Experimental |
| 58 | p1an-lin-jung/WavThruVec_pytorch | An implementation of Charactr, Inc's "WavThruVec: Latent speech... | 20 | Experimental |
| 59 | rishikksh20/voxtral-codec-pytoch | Voxtral Codec: Combining Semantic VQ and Acoustic FSQ for Ultra-Low Bitrate... | 19 | Experimental |
| 60 | PeechApp/tts-peech | DelightfulTTS with HiFi-GAN and UnivNet vocoders | 18 | Experimental |
| 61 | aminul-huq/Adversarial-Examples-For-Audio-Data | Repo for papers to read on adversarial attack and defense techniques in the... | 18 | Experimental |
| 62 | egorsmkv/radtts-hifigan | RADTTS + HiFiGAN vocoder | 18 | Experimental |
| 63 | ZhanpengWang96/pytorch-speech2vec | PyTorch implementation of the paper Speech2Vec: A Sequence-to-Sequence... | 17 | Experimental |
| 64 | Orca0917/Spectrogram-VQ | Unofficial implementation of Spectrogram VQ from DCTTS paper - Vector... | 15 | Experimental |
| 65 | NTT123/hifigan-tpu | Train HiFi-GAN on TPU | 14 | Experimental |
| 66 | diver-j/melgan-multi | MelGAN Multi GPU Implementation. | 13 | Experimental |
| 67 | will-rice/diffwave | TensorFlow 2.0 Implementation of DiffWave: A Versatile Diffusion Model for... | 13 | Experimental |
| 68 | mzyICT/MSDGAN | Speech denoising based on generative adversarial networks | 12 | Experimental |
| 69 | che-roman/mb-melgan | Unofficial implementation of Multi-band MelGAN | 10 | Experimental |
| 70 | neyudin/wavenetglow | Main repository for the "Modern Methods of Speech Recognition and Synthesis"... | 10 | Experimental |
| 71 | StellarTerror/NeuralVocoders | Implementations of HiFi-GAN, iSTFTNet and MISRNet | 10 | Experimental |
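The relationship between score and tier in the listing can be summarized in a few lines. In the 71 rows above, every score from 30 to 47 is labelled Emerging (ranks 1 to 34) and every score from 10 to 29 is labelled Experimental (ranks 35 to 71). The sketch below encodes only that observed pattern. The cutoff of 30 is inferred from the rows, not a documented threshold, and the function name `observed_tier` is hypothetical. Scores outside the observed range are rejected, since the listing does not show which tier they would receive.

```python
# Observed in the listing above: scores 30-47 are "Emerging",
# scores 10-29 are "Experimental". The cutoff is inferred, not documented.
OBSERVED_MIN_SCORE = 10
OBSERVED_MAX_SCORE = 47
INFERRED_EMERGING_CUTOFF = 30


def observed_tier(score):
    """Return the tier label the listing shows for a score in the observed range."""
    if not OBSERVED_MIN_SCORE <= score <= OBSERVED_MAX_SCORE:
        raise ValueError(
            f"score {score} is outside the range shown in the listing "
            f"({OBSERVED_MIN_SCORE}-{OBSERVED_MAX_SCORE})"
        )
    if score >= INFERRED_EMERGING_CUTOFF:
        return "Emerging"
    return "Experimental"
```

For example, `observed_tier(30)` matches the row for Rongjiehuang/Multiband-WaveRNN and `observed_tier(29)` matches the row for Fraunhofer-AISEC/towards-resistant-audio-adversarial-examples, the two entries on either side of the boundary.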