shangeth/wavencoder
WavEncoder is a Python library for encoding audio signals, applying transforms for audio augmentation, and training audio classification models with a PyTorch backend.
Provides modular custom layers (SincNet, TDNN, multi-head attention variants) and integrates pretrained wav2vec/wav2vec2 encoders for feature extraction. Includes a composable transform pipeline for on-the-fly augmentation (noise injection, time/frequency masking, reverberation) and a built-in trainer utility for end-to-end classification workflows with PyTorch's standard module interface.
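The composable transform pipeline described above can be sketched roughly as follows. This is a minimal illustration of the pattern only, not WavEncoder's actual API: the names `Compose`, `AddNoise`, and `TimeMask` are hypothetical, and plain Python lists stand in for the PyTorch tensors a real pipeline would operate on, so the sketch runs without extra dependencies.

```python
import random
from typing import Callable, List, Sequence

Waveform = List[float]
Transform = Callable[[Waveform], Waveform]


class Compose:
    """Apply transforms in order (hypothetical helper, not WavEncoder's API)."""

    def __init__(self, transforms: Sequence[Transform]) -> None:
        self.transforms = list(transforms)

    def __call__(self, wav: Waveform) -> Waveform:
        for transform in self.transforms:
            wav = transform(wav)
        return wav


class AddNoise:
    """Add zero-mean Gaussian noise with the given standard deviation."""

    def __init__(self, std: float, rng: random.Random) -> None:
        self.std = std
        self.rng = rng

    def __call__(self, wav: Waveform) -> Waveform:
        return [x + self.rng.gauss(0.0, self.std) for x in wav]


class TimeMask:
    """Zero out one randomly placed span of at most max_len samples."""

    def __init__(self, max_len: int, rng: random.Random) -> None:
        self.max_len = max_len
        self.rng = rng

    def __call__(self, wav: Waveform) -> Waveform:
        # Pick a span length, then a start index that keeps the span in bounds.
        length = self.rng.randint(0, min(self.max_len, len(wav)))
        start = self.rng.randint(0, len(wav) - length)
        return wav[:start] + [0.0] * length + wav[start + length:]


# A seeded generator keeps the augmentation reproducible for this demo.
rng = random.Random(0)
augment = Compose([AddNoise(std=0.01, rng=rng), TimeMask(max_len=4, rng=rng)])
augmented = augment([0.5] * 16)
```

Because each transform is just a callable from waveform to waveform, the pipeline can be applied per sample at load time, which is what makes on-the-fly augmentation cheap to wire into a dataset class.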
No commits in the last 6 months. Available on PyPI.
Stars: 92
Forks: 14
Language: Python
License: MIT
Category:
Last pushed: Jun 06, 2021
Monthly downloads: 49
Commits (30d): 0
Dependencies: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/shangeth/wavencoder"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
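The same request can be made from Python using only the standard library. This is a hedged sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, the URL layout is inferred from the single curl example above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request

# Base path taken from the curl example; the path layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository in a given category."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data; assumes (unverified) a JSON response body."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "shangeth/wavencoder")` would issue the same GET request as the curl command. Keyless use is limited to 100 requests per day, so caching the result locally is sensible.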
Related tools
fatchord/WaveRNN
WaveRNN Vocoder + TTS
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch
seungwonpark/melgan
MelGAN vocoder (compatible with NVIDIA/tacotron2)
rishikksh20/iSTFTNet-pytorch
iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier...
lucasnewman/best-rq-pytorch
Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a...