CVxTz/music_genre_classification

Music genre classification: LSTM vs Transformer

Emerging

Implements multi-label genre classification on 106k tracks from the Free Music Archive using Mel-spectrogram audio features converted to 128-D sequential vectors. Directly compares GRU and Transformer encoders with matched parameter counts (~700k), evaluating performance via micro-averaged precision-recall curves across 161 imbalanced genre classes. Uses librosa for audio preprocessing with pre-computed NumPy arrays to accelerate training, plus test-time augmentation via multiple sequence crops to improve predictions.
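The test-time augmentation step mentioned above can be sketched as follows. This is a minimal illustration, not the repository's code: the function names, the crop length, the number of crops, and the stand-in model are all assumptions, and averaging per-crop probabilities is only one common way to combine the crops.

```python
import numpy as np

N_MELS = 128      # feature size per time step, as in the description
N_CLASSES = 161   # number of genre labels, as in the description


def predict_with_crops(features, predict_fn, crop_len=256, n_crops=8, seed=0):
    """Average predictions over several random crops of one track.

    features:   array of shape (n_frames, n_mels)
    predict_fn: hypothetical callable mapping (batch, crop_len, n_mels)
                to per-class probabilities of shape (batch, n_classes)
    """
    rng = np.random.default_rng(seed)
    n_frames = features.shape[0]
    if n_frames <= crop_len:
        # Short track: zero-pad to the crop length and use a single crop.
        pad = np.zeros((crop_len - n_frames, features.shape[1]), dtype=features.dtype)
        crops = [np.concatenate([features, pad], axis=0)]
    else:
        # Long track: draw random start offsets and slice fixed-length windows.
        starts = rng.integers(0, n_frames - crop_len + 1, size=n_crops)
        crops = [features[s:s + crop_len] for s in starts]
    batch = np.stack(crops)            # (n_crops, crop_len, n_mels)
    probs = predict_fn(batch)          # (n_crops, n_classes)
    return probs.mean(axis=0)          # (n_classes,)


# Stand-in for a trained model (assumption): a fixed random linear layer
# applied to time-averaged features, followed by a sigmoid per class.
_W = np.random.default_rng(42).normal(scale=0.1, size=(N_MELS, N_CLASSES))


def dummy_predict(batch):
    logits = batch.mean(axis=1) @ _W
    return 1.0 / (1.0 + np.exp(-logits))


track = np.random.default_rng(1).normal(size=(1000, N_MELS))
scores = predict_with_crops(track, dummy_predict)
```

Because the task is multi-label, the sketch uses an independent sigmoid per class rather than a softmax, so each of the 161 scores can be thresholded on its own.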

No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 9 / 25

Community 18 / 25

Language

Python

License

MIT

Higher-rated alternatives

CouncilDataProject/speakerbox

Speakerbox: Fine-tune Audio Transformers for speaker identification.

HHousen/speaker-change-detection

Speaker change detection using SincNet and an LSTM/Transformer

palonso/MAEST

Pre-training, fine-tuning, and inference code with the MAEST models for music analysis applications.

aaronstevenwhite/spectrans

Modular spectral transformer implementations in PyTorch with Fourier, wavelet, and other...

icon-lab/HST

Official implementation of Hierarchical Spectrogram Transformers (HST)
