pytorch/audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

/ 100

Established

Provides audio transforms (spectrogram, mel spectrogram, MFCC) that run on the GPU when their inputs are CUDA tensors, along with speech processing functions such as forced alignment. The transforms are implemented as differentiable PyTorch operations, so they can sit inside an end-to-end training pipeline. Also includes compliance interfaces that replicate Kaldi feature extraction, so pipelines built on Kaldi features can move to PyTorch while keeping gradient flow through the audio front end.

2,838 stars. Actively maintained with 1 commit in the last 30 days.

No Package

No Dependents

Maintenance 16 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 25 / 25


Stars 2,838

Forks 764

Language Python

License BSD-2-Clause

Related frameworks

deezer/spleeter

Deezer source separation library including pretrained models.

asteroid-team/asteroid

The PyTorch-based audio source separation toolkit for researchers

drscotthawley/aeiou

(ML) audio engineering i/o utils

audeering/opensmile

The Munich Open-Source Large-Scale Multimedia Feature Extractor

mindspore-lab/mindaudio

A toolbox of audio models and algorithms based on MindSpore
