habla-liaa/ser-with-w2v2
Official implementation of the INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'
Uses pretrained Wav2vec 2.0 encodings as fixed speech representations, fusing encoder outputs with transformer-layer embeddings to extract emotion-relevant features. Evaluated on the RAVDESS and IEMOCAP datasets with 5-fold cross-validation over multiple random seeds. Provides pretrained checkpoints for both datasets and a reproducible experimental pipeline driven by YAML configuration and shell scripts.
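To make the idea of fusing layer outputs concrete, here is a minimal sketch of one possible strategy: a softmax-weighted average across layers. This is an assumption for illustration only; the description above does not say which fusion strategies the repository implements, and the names `softmax` and `fuse_layers` are hypothetical, not taken from the repository.

```python
import math


def softmax(weights):
    """Normalize raw layer weights so they are positive and sum to 1."""
    peak = max(weights)  # subtract the max for numerical stability
    exps = [math.exp(w - peak) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_outputs, layer_weights):
    """Weighted average over layers (hypothetical helper, not the repo's API).

    layer_outputs: list of L layers, each a list of T frames,
                   each frame a list of D floats.
    layer_weights: list of L raw (unnormalized) weights, which would be
                   learned during training in a real model.
    Returns a list of T frames, each a list of D floats.
    """
    norm = softmax(layer_weights)
    num_frames = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])
    fused = [[0.0] * dim for _ in range(num_frames)]
    for weight, layer in zip(norm, layer_outputs):
        for t, frame in enumerate(layer):
            for d, value in enumerate(frame):
                fused[t][d] += weight * value
    return fused
```

With equal raw weights, the result reduces to a plain average of the layers; in a trained model the weights would let some layers contribute more than others.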
140 stars. No commits in the last 6 months.
Stars: 140
Forks: 25
Language: Jupyter Notebook
License: —
Category:
Last pushed: Jan 06, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/habla-liaa/ser-with-w2v2"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
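The same request can be sketched in Python using only the standard library. The URL layout is copied from the curl command above; the function names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = [urllib.parse.quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record for a repository.

    Assumes (unverified) that the response body is JSON.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "habla-liaa", "ser-with-w2v2")` would issue the same GET request as the curl command; unauthenticated calls count against the 100 requests/day limit.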
Higher-rated alternatives
salute-developers/GigaAM
Foundational Model for Speech Recognition Tasks
SuyashMore/MevonAI-Speech-Emotion-Recognition
Identify the emotion of multiple speakers in an Audio Segment
AkishinoShiame/Chinese-Speech-Emotion-Datasets
Datasets of A Deep Convolutional Neural Network Based Virtual Elderly Companion Agent.
NotAbhinavGamerz/emotion-aware-automatic-speech-recognition
🎤 Enhance speech recognition by detecting emotions in spoken language, combining OpenAI's...
jsugg/ser
The AI-powered ser Python package is a tool for recognizing and analyzing emotions in speech....