habla-liaa/ser-with-w2v2

Official implementation of the INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'

Emerging

Leverages pretrained Wav2vec 2.0 encodings as fixed speech representations, fusing the encoder outputs with the transformer layer embeddings to extract emotion-relevant features. Evaluated on the RAVDESS and IEMOCAP datasets using 5-fold cross-validation with multiple random seeds. Pretrained checkpoints are provided for both datasets, along with a reproducible experimental pipeline driven by YAML configuration and shell scripts.
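
The description above does not spell out the fusion strategy, so the following is only a minimal sketch of one common option: a softmax-weighted average over per-layer embeddings, followed by mean pooling over time. The function names (`softmax`, `fuse_layers`, `mean_pool`) and the nested-list layout are hypothetical and chosen for illustration; they are not taken from the repository, and in a real model the layer logits would be trainable parameters.

```python
import math


def softmax(logits):
    """Turn unnormalized layer scores into weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_outputs, layer_logits):
    """Weighted average across layers (assumed fusion, not the repo's code).

    layer_outputs: list of L layers, each a list of T frames,
                   each frame a list of D floats.
    layer_logits:  list of L floats, one score per layer.
    Returns a list of T fused frames, each of dimension D.
    """
    weights = softmax(layer_logits)
    num_frames = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])
    fused = [[0.0] * dim for _ in range(num_frames)]
    for weight, layer in zip(weights, layer_outputs):
        for t in range(num_frames):
            for d in range(dim):
                fused[t][d] += weight * layer[t][d]
    return fused


def mean_pool(frames):
    """Average over the time axis to get one utterance-level vector."""
    count = len(frames)
    return [sum(frame[d] for frame in frames) / count
            for d in range(len(frames[0]))]


# Toy example: 2 layers, 2 frames, 2 dimensions, equal layer scores.
layers = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 4.0], [5.0, 6.0]],
]
fused = fuse_layers(layers, [0.0, 0.0])
utterance_vector = mean_pool(fused)
```

With equal scores the fusion reduces to a plain average of the layers; unequal scores let a downstream classifier emphasize whichever layers carry more emotion-relevant information.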

140 stars. No commits in the last 6 months.

No License · Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 19 / 25

Stars 140

Forks

Language Jupyter Notebook

License —

Higher-rated alternatives

salute-developers/GigaAM

Foundational Model for Speech Recognition Tasks

SuyashMore/MevonAI-Speech-Emotion-Recognition

Identify the emotion of multiple speakers in an Audio Segment

AkishinoShiame/Chinese-Speech-Emotion-Datasets

Datasets of A Deep Convolutional Neural Network Based Virtual Elderly Companion Agent.

NotAbhinavGamerz/emotion-aware-automatic-speech-recognition

🎤 Enhance speech recognition by detecting emotions in spoken language, combining OpenAI's...

jsugg/ser

The AI-powered ser Python package is a tool for recognizing and analyzing emotions in speech....
