microsoft/UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

/ 100

Emerging

Unifies self-supervised and supervised learning across multiple speech tasks (ASR, speaker recognition, speech enhancement) through variants such as WavLM and UniSpeech-SAT, which incorporate speaker-aware pre-training and intermediate-layer supervision. Pre-training data scales from 960 hours (LibriSpeech) to 94k hours drawn from Libri-Light, GigaSpeech, and VoxPopuli, with multilingual support for English, French, Spanish, and Italian. The models are integrated with Hugging Face, so they can be loaded and fine-tuned on downstream tasks with the standard tooling.
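As a rough illustration of the Hugging Face loading path described above, the sketch below extracts frame-level features from a waveform with a WavLM checkpoint. It assumes the `transformers` and `torch` packages are installed and that the checkpoint name `microsoft/wavlm-base-plus` is the one you want; the helper names `make_silence` and `extract_features` are hypothetical and exist only for this example.

```python
# Minimal sketch, not the repository's own code.
# Assumptions: `transformers` and `torch` are installed, and the checkpoint
# below is available on the Hugging Face Hub.
CHECKPOINT = "microsoft/wavlm-base-plus"
SAMPLING_RATE = 16_000  # assumed input rate: 16 kHz mono audio


def make_silence(seconds, sampling_rate=SAMPLING_RATE):
    """Return a silent mono waveform as a list of floats (placeholder input)."""
    return [0.0] * int(seconds * sampling_rate)


def extract_features(waveform, sampling_rate=SAMPLING_RATE, checkpoint=CHECKPOINT):
    """Run a pre-trained WavLM encoder and return its last hidden states."""
    # Imported here so the helpers above can be used without the heavy dependencies.
    import torch
    from transformers import AutoFeatureExtractor, WavLMModel

    feature_extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
    model = WavLMModel.from_pretrained(checkpoint)
    model.eval()

    # The feature extractor normalizes the raw samples and builds a batch tensor.
    inputs = feature_extractor(
        waveform, sampling_rate=sampling_rate, return_tensors="pt"
    )
    with torch.no_grad():
        outputs = model(**inputs)
    # Shape: (batch, frames, hidden_size)
    return outputs.last_hidden_state
```

Calling `extract_features(make_silence(1.0))` downloads the checkpoint on first use and returns one feature vector per encoder frame. For fine-tuning, a task-specific head would replace the bare `WavLMModel`; which head fits depends on the downstream task.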

479 stars. No commits in the last 6 months.

Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 21 / 25

Stars

479

Forks

Language

Python

License

—

Higher-rated alternatives

Spr-Aachen/Easy-Voice-Toolkit

A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion etc.

ftyers/commonvoice-utils

Linguistic processing for Common Voice

alphacep/awesome-russian-speech

Russian speech technology links

microsoft/SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

PrzemyslawSwiderski/python-gradle-plugin

Gradle plugin to run Python projects.
