microsoft/UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
Unifies self-supervised and supervised learning across multiple speech tasks (ASR, speaker recognition, speech enhancement). Variants such as WavLM and UniSpeech-SAT add speaker-aware pre-training and intermediate-layer supervision. Pre-training data scales from 960 hours (LibriSpeech) to 94k hours drawn from Libri-Light, GigaSpeech, and VoxPopuli, with multilingual support for English, French, Spanish, and Italian. Models are integrated with Hugging Face for loading and fine-tuning on downstream tasks.
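As a rough illustration of the Hugging Face integration mentioned above, the sketch below loads a WavLM checkpoint with the `transformers` library and extracts frame-level features from a synthetic tone. The checkpoint name `microsoft/wavlm-base-plus`, the helper names `make_test_tone` and `embed_waveform`, and the 16 kHz sample rate are assumptions made for this example, not details given on this page. It requires `torch` and `transformers` to be installed and downloads weights on first use.

```python
import math

# Assumed checkpoint name on the Hugging Face Hub (not stated on this page).
CHECKPOINT = "microsoft/wavlm-base-plus"


def make_test_tone(seconds=1.0, sample_rate=16000, freq=440.0):
    """Return a mono sine wave as a list of floats in [-1, 1]."""
    n = int(seconds * sample_rate)
    return [math.sin(2.0 * math.pi * freq * t / sample_rate) for t in range(n)]


def embed_waveform(waveform, sample_rate=16000, checkpoint=CHECKPOINT):
    """Run a waveform through a pre-trained WavLM encoder.

    Heavy dependencies are imported lazily so that importing this module
    does not require torch or transformers.
    """
    import torch
    from transformers import AutoFeatureExtractor, WavLMModel

    extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
    model = WavLMModel.from_pretrained(checkpoint)
    model.eval()

    # The extractor normalizes the raw audio and returns PyTorch tensors.
    inputs = extractor(waveform, sampling_rate=sample_rate, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Tensor of shape (batch, frames, hidden_size).
    return outputs.last_hidden_state
```

Calling `embed_waveform(make_test_tone())` should return one feature vector per audio frame; a downstream task would typically place a small task-specific head on top of these features.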
479 stars. No commits in the last 6 months.
Stars: 479
Forks: 76
Language: Python
License: not listed
Category: not listed
Last pushed: Apr 05, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/microsoft/UniSpeech"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
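The same request can be made from Python using only the standard library. This is a minimal sketch: the `category/owner/repo` path pattern is inferred from the single example URL above, the response is assumed to be JSON, and the helper names `build_quality_url` and `fetch_quality` are hypothetical. How an API key is passed is not described here, so the sketch makes keyless requests only.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record and parse it as JSON.

    Assumes the endpoint returns a JSON body; the schema is not
    documented on this page, so the parsed object is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For this repository the call would be `fetch_quality("voice-ai", "microsoft", "UniSpeech")`. At the keyless limit of 100 requests/day, results are worth caching locally.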
Higher-rated alternatives
Spr-Aachen/Easy-Voice-Toolkit
A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion etc.
ftyers/commonvoice-utils
Linguistic processing for Common Voice
alphacep/awesome-russian-speech
Russian speech technology links
microsoft/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
PrzemyslawSwiderski/python-gradle-plugin
Gradle plugin to run Python projects.