iver56/torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
Transforms are implemented as `nn.Module` subclasses with GPU (CUDA) support, so they can be composed directly into neural network pipelines with minimal training overhead. The library offers flexible randomization control through `per_batch`, `per_example`, and `per_channel` modes for fine-grained augmentation strategies, and most transforms are differentiable, enabling end-to-end training. It includes 15+ waveform transforms (pitch shift, filtering, gain modulation, impulse response convolution) and handles batched multichannel audio natively.
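To illustrate the randomization modes described above, here is a minimal sketch of a random-gain transform written as an `nn.Module`, with the gain drawn once per batch, per example, or per channel. This is an illustrative reimplementation, not the library's actual `Gain` transform; the class name, parameters, and the `(batch, channels, samples)` input convention are assumptions for the sketch.

```python
import torch
from torch import nn


class RandomGain(nn.Module):
    """Illustrative per-batch/per-example/per-channel random gain.

    A sketch of the randomization-mode idea, not the library's own code.
    Expects input shaped (batch, channels, samples).
    """

    def __init__(self, min_gain_db=-12.0, max_gain_db=6.0, mode="per_example"):
        super().__init__()
        self.min_gain_db = min_gain_db
        self.max_gain_db = max_gain_db
        self.mode = mode  # "per_batch", "per_example", or "per_channel"

    def forward(self, samples: torch.Tensor) -> torch.Tensor:
        batch, channels, _ = samples.shape
        if self.mode == "per_batch":
            shape = (1, 1, 1)             # one gain shared by the whole batch
        elif self.mode == "per_example":
            shape = (batch, 1, 1)         # one gain per example
        else:
            shape = (batch, channels, 1)  # one gain per channel
        gain_db = torch.empty(shape).uniform_(self.min_gain_db, self.max_gain_db)
        # dB -> linear amplitude; plain multiplication keeps the op differentiable
        return samples * torch.pow(torch.tensor(10.0), gain_db / 20.0)


audio = torch.rand(4, 2, 16000) * 2 - 1        # dummy batch of stereo audio
augmented = RandomGain(mode="per_channel")(audio)
```

Broadcasting the gain tensor against the input is what makes all three modes share one code path: only the shape of the random draw changes.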
Stars: 1,136
Forks: 100
Language: Python
License: MIT
Category:
Last pushed: Nov 24, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/iver56/torch-audiomentations"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
Related tools
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz...
crlandsc/torch-log-wmse
logWMSE, an audio quality metric & loss function with support for digital silence target. Useful...
drethage/speech-denoising-wavenet
A neural network for end-to-end speech denoising
KyungsuKim42/tokensynth
The official implementation of TokenSynth (ICASSP 2025)
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".