iver56/torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
Transforms are implemented as `nn.Module` subclasses with GPU (CUDA) support, so they can be composed directly into neural network pipelines with minimal training overhead. The library offers flexible randomization control through `per_batch`, `per_example`, and `per_channel` modes for fine-grained augmentation strategies, and most transforms are differentiable, enabling end-to-end training. It includes 15+ waveform transforms (pitch shift, filtering, gain modulation, impulse response convolution) and handles batched multichannel audio natively.
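To illustrate the randomization modes described above, here is a minimal sketch of a random-gain transform written as an `nn.Module`, with the gain drawn once per batch, per example, or per channel. This is an illustrative reimplementation, not the library's actual `Gain` transform; the class name, parameters, and the `(batch, channels, samples)` input convention are assumptions for the sketch.

```python
import torch
from torch import nn


class RandomGain(nn.Module):
    """Illustrative per-batch/per-example/per-channel random gain.

    A sketch of the randomization-mode idea, not the library's own code.
    Expects input shaped (batch, channels, samples).
    """

    def __init__(self, min_gain_db=-12.0, max_gain_db=6.0, mode="per_example"):
        super().__init__()
        self.min_gain_db = min_gain_db
        self.max_gain_db = max_gain_db
        self.mode = mode  # "per_batch", "per_example", or "per_channel"

    def forward(self, samples: torch.Tensor) -> torch.Tensor:
        batch, channels, _ = samples.shape
        if self.mode == "per_batch":
            shape = (1, 1, 1)             # one gain shared by the whole batch
        elif self.mode == "per_example":
            shape = (batch, 1, 1)         # one gain per example
        else:
            shape = (batch, channels, 1)  # one gain per channel
        gain_db = torch.empty(shape).uniform_(self.min_gain_db, self.max_gain_db)
        # dB -> linear amplitude; plain multiplication keeps the op differentiable
        return samples * torch.pow(torch.tensor(10.0), gain_db / 20.0)


audio = torch.rand(4, 2, 16000) * 2 - 1        # dummy batch of stereo audio
augmented = RandomGain(mode="per_channel")(audio)
```

Broadcasting the gain tensor against the input is what makes all three modes share one code path: only the shape of the random draw changes.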
Stars: 1,136
Forks: 100
Language: Python
License: MIT
Category:
Last pushed: Nov 24, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/iver56/torch-audiomentations"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
Related tools
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz...
crlandsc/torch-log-wmse
logWMSE, an audio quality metric & loss function with support for digital silence target. Useful...
drethage/speech-denoising-wavenet
A neural network for end-to-end speech denoising
KyungsuKim42/tokensynth
The official implementation of TokenSynth (ICASSP 2025)
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".