self-supervised-speech-recognition and wav2vec2-huggingface-demo
These are ecosystem siblings: one is a full training pipeline for building wav2vec 2.0 speech recognizers from scratch, while the other is a demonstration application showing how to run inference with pretrained wav2vec2 models via Hugging Face's Transformers library.
About self-supervised-speech-recognition
mailong25/self-supervised-speech-recognition
speech to text with self-supervised learning based on wav2vec 2.0 framework
Builds accurate speech recognition for low-resource languages through a three-stage pipeline: self-supervised pretraining on unlabeled audio, fine-tuning on minimal labeled data (as little as 1 hour), and n-gram language model integration for beam search decoding. Leverages fairseq's wav2vec 2.0 implementation with cross-lingual transfer initialization and optional KenLM decoding, enabling practical deployment via a simple Python API despite training resource requirements (V100 GPUs).
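The third stage, n-gram language model integration for beam search decoding, can be illustrated with a toy shallow-fusion example. The vocabulary, bigram probabilities, and `lm_weight` below are invented for illustration; the repo itself uses KenLM models over a real character/word vocabulary.

```python
import math

# Toy shallow fusion: at each time step, combine the acoustic model's
# per-character log-probabilities with a hypothetical bigram LM score,
# keeping only the top `beam_width` hypotheses.
VOCAB = ["a", "b", "c"]

# Hypothetical bigram LM: log P(next_char | prev_char); "<s>" = start.
BIGRAM_LM = {
    ("<s>", "a"): math.log(0.6), ("<s>", "b"): math.log(0.3), ("<s>", "c"): math.log(0.1),
    ("a", "a"): math.log(0.2), ("a", "b"): math.log(0.7), ("a", "c"): math.log(0.1),
    ("b", "a"): math.log(0.1), ("b", "b"): math.log(0.1), ("b", "c"): math.log(0.8),
    ("c", "a"): math.log(0.5), ("c", "b"): math.log(0.3), ("c", "c"): math.log(0.2),
}

def beam_search(acoustic_logprobs, beam_width=2, lm_weight=0.5):
    """acoustic_logprobs: one {char: logprob} dict per time step."""
    beams = [("", "<s>", 0.0)]  # (prefix, previous char, total score)
    for step in acoustic_logprobs:
        candidates = []
        for prefix, prev, score in beams:
            for ch, am_lp in step.items():
                lm_lp = BIGRAM_LM[(prev, ch)]  # LM score blended into the beam
                candidates.append((prefix + ch, ch, score + am_lp + lm_weight * lm_lp))
        beams = sorted(candidates, key=lambda b: b[2], reverse=True)[:beam_width]
    return beams[0][0]

steps = [
    {"a": math.log(0.5), "b": math.log(0.3), "c": math.log(0.2)},
    {"a": math.log(0.2), "b": math.log(0.5), "c": math.log(0.3)},
]
print(beam_search(steps))  # → "ab"
```

The `lm_weight` hyperparameter controls how strongly the language model pulls decoding toward fluent text versus what the acoustic model heard.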
About wav2vec2-huggingface-demo
bhattbhavesh91/wav2vec2-huggingface-demo
Speech to Text with self-supervised learning based on wav2vec 2.0 framework using Hugging Face's Transformers