CVxTz/music_genre_classification
Music genre classification: LSTM vs Transformer
Implements multi-label genre classification on 106k tracks from the Free Music Archive, representing each track's Mel-spectrogram as a sequence of 128-dimensional vectors. Directly compares GRU and Transformer encoders with matched parameter counts (~700k), evaluating them with micro-averaged precision-recall curves across 161 imbalanced genre classes. Preprocesses audio with librosa and stores the results as pre-computed NumPy arrays to speed up training, and applies test-time augmentation over multiple sequence crops to improve predictions.
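As an illustration of the micro-averaged evaluation mentioned above, here is a minimal sketch in plain Python. It is not taken from the repository: the function name `micro_precision_recall`, the 0.5 threshold, and the toy data are all assumptions made for this example. Micro-averaging pools true positives, false positives, and false negatives across every (track, genre) pair before computing the ratios, so frequent genres contribute more than rare ones.

```python
def micro_precision_recall(y_true, y_score, threshold=0.5):
    """Micro-averaged precision and recall for multi-label predictions.

    y_true:  list of rows, each a list of 0/1 labels (one per genre).
    y_score: list of rows of predicted probabilities, same shape.
    Hypothetical helper for illustration; not the repository's code.
    """
    tp = fp = fn = 0
    # Pool counts over every (track, genre) pair instead of per class.
    for true_row, score_row in zip(y_true, y_score):
        for label, score in zip(true_row, score_row):
            predicted = score >= threshold
            if predicted and label:
                tp += 1
            elif predicted and not label:
                fp += 1
            elif label:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data: 3 tracks, 4 made-up genre classes.
y_true = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]]
y_score = [[0.9, 0.2, 0.1, 0.4], [0.3, 0.8, 0.6, 0.1], [0.7, 0.4, 0.2, 0.1]]
precision, recall = micro_precision_recall(y_true, y_score)
print(precision, recall)
```

Sweeping `threshold` from 0 to 1 and recording each (recall, precision) pair would trace out a micro-averaged precision-recall curve of the kind the description refers to.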
No commits in the last 6 months.
Stars: 63
Forks: 12
Language: Python
License: MIT
Category:
Last pushed: Mar 25, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/CVxTz/music_genre_classification"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
CouncilDataProject/speakerbox
Speakerbox: Fine-tune Audio Transformers for speaker identification.
HHousen/speaker-change-detection
Speaker change detection using SincNet and an LSTM/Transformer
palonso/MAEST
Pre-training, fine-tuning, and inference code with the MAEST models for music analysis applications.
aaronstevenwhite/spectrans
Modular spectral transformer implementations in PyTorch with Fourier, wavelet, and other...
icon-lab/HST
Official implementation of Hierarchical Spectrogram Transformers (HST)