oliverguhr/wav2vec2-live
Live speech recognition using Facebook's wav2vec 2.0 model.
Provides real-time streaming speech recognition by continuously processing microphone input through any wav2vec2 model from Hugging Face, with configurable audio devices and per-inference timing metrics. The architecture uses PyAudio for live audio capture and runs inference asynchronously, returning recognized text alongside processing latency and sample duration for performance monitoring.
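The capture-then-infer flow described above can be sketched as a small producer/consumer pipeline. This is a minimal illustration, not the project's actual code: the class name `LiveRecognizer`, its methods, and the `transcribe` callback are hypothetical, the microphone is replaced by manually fed byte chunks so the sketch runs without PyAudio or a model, and the audio is assumed to be 16-bit mono at 16 kHz.

```python
import queue
import threading
import time
from typing import Callable, Optional, Tuple

SAMPLE_RATE = 16_000   # assumption: 16 kHz mono input
BYTES_PER_SAMPLE = 2   # assumption: 16-bit PCM


class LiveRecognizer:
    """Hypothetical sketch: runs a transcribe callback on a background thread."""

    def __init__(self, transcribe: Callable[[bytes], str]) -> None:
        self.transcribe = transcribe
        self.audio_queue: "queue.Queue[Optional[bytes]]" = queue.Queue()
        self.result_queue: "queue.Queue[Tuple[str, float, float]]" = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._worker.start()

    def feed(self, chunk: bytes) -> None:
        # In a live setup, an audio capture callback would push chunks here.
        self.audio_queue.put(chunk)

    def stop(self) -> None:
        self.audio_queue.put(None)  # sentinel that ends the worker loop
        self._worker.join()

    def _run(self) -> None:
        while True:
            chunk = self.audio_queue.get()
            if chunk is None:
                break
            sample_seconds = len(chunk) / BYTES_PER_SAMPLE / SAMPLE_RATE
            started = time.perf_counter()
            text = self.transcribe(chunk)  # model inference would happen here
            inference_seconds = time.perf_counter() - started
            self.result_queue.put((text, sample_seconds, inference_seconds))

    def get_last_text(self, timeout: Optional[float] = None) -> Tuple[str, float, float]:
        # Blocks until a result is ready: (text, sample duration, inference time).
        return self.result_queue.get(timeout=timeout)


if __name__ == "__main__":
    recognizer = LiveRecognizer(lambda chunk: "hello world")  # stub model
    recognizer.start()
    recognizer.feed(b"\x00" * (SAMPLE_RATE * BYTES_PER_SAMPLE))  # one second of silence
    print(recognizer.get_last_text(timeout=5))
    recognizer.stop()
```

Comparing the inference time with the sample duration for each result gives a quick check on whether recognition keeps up with real time: if inference regularly takes longer than the audio it covers, the queue will back up.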
378 stars. No commits in the last 6 months.
Stars: 378
Forks: 58
Language: Python
License: MIT
Category:
Last pushed: Feb 04, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/oliverguhr/wav2vec2-live"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
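The same request can be made from Python with the standard library. This is a sketch under assumptions: the `category/owner/repo` URL pattern is inferred from the single example above, the response is assumed to be JSON (its fields are not documented here), and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    # Assumption: other repositories follow the same path layout as the example.
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    # Assumption: the endpoint returns a JSON body; no API key is sent.
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(fetch_quality("voice-ai", "oliverguhr", "wav2vec2-live"))
```

With the keyless limit of 100 requests/day, it is worth caching responses locally rather than calling the endpoint on every run.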
Higher-rated alternatives
liangstein/Chinese-speech-to-text
Chinese Speech To Text Using Wavenet
louiskirsch/speechT
Open-source speech-to-text software written in TensorFlow
Open-Speech-EkStep/vakyansh-models
Open source speech to text models for Indic Languages
Open-Speech-EkStep/vakyansh-wav2vec2-experimentation
Repository containing an experimentation platform for training and running inference on wav2vec2 models.
silversparro/wav2letter.pytorch
A fully convolutional network for speech-to-text, built on PyTorch.