self-supervised-speech-recognition and wav2vec2-huggingface-demo

These are ecosystem siblings: mailong25/self-supervised-speech-recognition is a full training pipeline built on fairseq's wav2vec 2.0, while bhattbhavesh91/wav2vec2-huggingface-demo is a demonstration notebook showing how to run wav2vec2 inference through Hugging Face's Transformers library.

self-supervised-speech-recognition
Maintenance: 0/25 · Adoption: 10/25 · Maturity: 8/25 · Community: 24/25
Stars: 379 · Forks: 116 · Downloads: — · Commits (30d): 0 · Language: Python · License: —
Flags: No License · Stale 6m · No Package · No Dependents

wav2vec2-huggingface-demo
Maintenance: 0/25 · Adoption: 7/25 · Maturity: 9/25 · Community: 18/25
Stars: 29 · Forks: 14 · Downloads: — · Commits (30d): 0 · Language: Jupyter Notebook · License: Apache-2.0
Flags: Stale 6m · No Package · No Dependents

About self-supervised-speech-recognition

mailong25/self-supervised-speech-recognition

Speech to text with self-supervised learning based on the wav2vec 2.0 framework

Builds accurate speech recognition for low-resource languages through a three-stage pipeline: self-supervised pretraining on unlabeled audio, fine-tuning on minimal labeled data (as little as 1 hour), and n-gram language model integration for beam search decoding. Leverages fairseq's wav2vec 2.0 implementation with cross-lingual transfer initialization and optional KenLM decoding, enabling practical deployment via a simple Python API despite training resource requirements (V100 GPUs).
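After fine-tuning, the acoustic model emits frame-level CTC logits; without the optional KenLM beam search, decoding reduces to collapsing repeated labels and dropping blanks. A minimal sketch of that greedy CTC step (the vocabulary and blank index here are illustrative, not this repo's actual tokenizer):

```python
def ctc_greedy_decode(ids, blank=0, id_to_char=None):
    """Greedy CTC decoding: collapse consecutive repeats, then drop blanks.

    ids        -- per-frame argmax label IDs from the acoustic model
    blank      -- ID of the CTC blank token (0 here, by assumption)
    id_to_char -- optional mapping from label ID to character
    """
    out, prev = [], None
    for i in ids:
        if i != prev and i != blank:  # new non-blank label starts here
            out.append(i)
        prev = i
    if id_to_char is not None:
        return "".join(id_to_char[i] for i in out)
    return out

# Frames: blank, c, c, blank, a, a, t  ->  "cat"
print(ctc_greedy_decode([0, 3, 3, 0, 1, 1, 2],
                        id_to_char={1: "a", 2: "t", 3: "c"}))  # -> cat
```

Beam search with an n-gram LM replaces the per-frame argmax with a scored search over label sequences, which is where the KenLM integration pays off.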

About wav2vec2-huggingface-demo

bhattbhavesh91/wav2vec2-huggingface-demo

Speech to text with self-supervised learning based on the wav2vec 2.0 framework, using Hugging Face's Transformers library
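The demo's core flow follows the standard Transformers wav2vec2 inference API. A minimal sketch, assuming the public `facebook/wav2vec2-base-960h` checkpoint (not necessarily the one the notebook uses) and a 16 kHz mono waveform; running it requires network access to download the model:

```python
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

def transcribe(waveform, sampling_rate=16000,
               checkpoint="facebook/wav2vec2-base-960h"):
    """Transcribe a 1-D float waveform with a CTC-fine-tuned wav2vec2 model."""
    processor = Wav2Vec2Processor.from_pretrained(checkpoint)
    model = Wav2Vec2ForCTC.from_pretrained(checkpoint)
    # Normalize and batch the raw audio for the model.
    inputs = processor(waveform, sampling_rate=sampling_rate,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    # Greedy CTC decode: per-frame argmax, then collapse via the tokenizer.
    ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(ids)[0]
```

Audio loaded with `librosa.load(path, sr=16000)` or `torchaudio` can be passed straight in as `waveform`.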

Scores updated daily from GitHub, PyPI, and npm data.