DmitryRyumin/OpenAV
An open-source library for recognition of speech commands in the user dictionary using audiovisual data of the speaker
OpenAV helps professionals in noisy environments control systems by voice, even when their speech is hard to make out. It takes audio and video of a speaker and recognizes speech commands from a user-defined dictionary. It is aimed at anyone operating machinery, managing inventory, or interacting with smart devices in difficult acoustic conditions.
No commits in the last 6 months. Available on PyPI.
Use this if you need robust voice command recognition in loud settings where visual cues of the speaker can improve accuracy.
Not ideal if you only need audio-based speech recognition or if you cannot provide concurrent video data of the speaker.
Stars: 7
Forks: 3
Language: Python
License: MIT
Category:
Last pushed: Feb 28, 2025
Commits (30d): 0
Dependencies: 29
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/DmitryRyumin/OpenAV"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
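The same request can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the path layout (category, owner, repository) is inferred from the single curl URL above, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL for one repository.

    Assumption: the path is <category>/<owner>/<repo>, inferred from
    the one example URL on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record for one repository (hypothetical helper).

    Assumption: the endpoint returns a JSON body. No API key is sent,
    so the keyless daily limit applies.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request):
# data = fetch_quality("voice-ai", "DmitryRyumin", "OpenAV")
# print(data)
```

Keeping URL construction separate from the network call makes the path logic easy to check offline, and the explicit timeout avoids hanging if the service is slow to respond.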
Higher-rated alternatives
Picovoice/porcupine
On-device wake word detection powered by deep learning
MycroftAI/mycroft-precise
A lightweight, simple-to-use, RNN wake word listener
arcosoph/nanowakeword
A lightweight, open-source, and intelligent wake word detection engine. Train custom,...
mozilla/DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run...
OAID/cortex-m-kws
Cortex M KWS example with Tengine Lite.