pytorch/audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
Provides GPU-accelerated audio transforms (Spectrogram, MelSpectrogram, MFCC) and speech processing functions such as forced alignment, implemented as differentiable PyTorch operations so they can be used in end-to-end training. Also includes compliance interfaces that replicate Kaldi feature extraction, which eases migration from Kaldi-based pipelines while keeping gradients flowing through the audio pipeline.
2,838 stars. Actively maintained with 1 commit in the last 30 days.
Stars: 2,838
Forks: 764
Language: Python
License: BSD-2-Clause
Category:
Last pushed: Mar 13, 2026
Commits (30d): 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/pytorch/audio"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
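The same request can be sketched in Python using only the standard library. The URL pattern (category, owner, repository) is inferred from the single curl example above, the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the JSON response format is an assumption, since this page does not document it.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="ml-frameworks"):
    """Build the endpoint URL; the path layout is inferred from the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(owner, repo, category="ml-frameworks", timeout=10.0):
    """Fetch the quality record without an API key.

    Assumption: the endpoint returns a JSON body; its fields are not
    documented on this page, so the parsed result is returned as-is.
    """
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("pytorch", "audio")` would issue the same GET request as the curl command. Keyless access is limited to 100 requests/day, so cache the result rather than polling.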
Related frameworks
deezer/spleeter
Deezer source separation library including pretrained models.
asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
drscotthawley/aeiou
(ML) audio engineering i/o utils
audeering/opensmile
The Munich Open-Source Large-Scale Multimedia Feature Extractor
mindspore-lab/mindaudio
A toolbox of audio models and algorithms based on MindSpore