amanvirparhar/chaplin
A real-time silent speech recognition tool.
Uses pre-trained visual speech recognition (VSR) models from the Auto-AVSR project to read lip movements from a webcam feed, processing video frames locally with no external APIs. MediaPipe handles facial landmark detection, an Ollama-hosted Qwen LLM post-processes the raw VSR predictions into coherent text, and the result is entered in real time through simulated keyboard input.
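As a rough illustration of the LLM post-processing step described above, the sketch below sends a raw VSR prediction to a locally running Ollama server and returns the cleaned-up text. It is not taken from the chaplin codebase: the prompt wording, the helper names `build_payload` and `correct_with_ollama`, and the model tag `qwen3:4b` are all assumptions chosen for the example. It relies only on Ollama's `/api/generate` endpoint on its default local port.

```python
import json
import urllib.request

# Ollama's default local endpoint for single-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: a hypothetical model tag; substitute whichever Qwen model is pulled locally.
MODEL = "qwen3:4b"


def build_payload(raw_vsr_text: str, model: str = MODEL) -> dict:
    """Build the JSON body asking the LLM to clean up a raw lip-reading transcript."""
    prompt = (
        "The following text is raw output from a lip-reading model and may "
        "contain misrecognized words. Rewrite it as a coherent sentence with "
        "correct casing and punctuation. Reply with the corrected text only.\n\n"
        f"Raw text: {raw_vsr_text.strip().lower()}"
    )
    # stream=False asks Ollama for one complete JSON object instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}


def correct_with_ollama(raw_vsr_text: str, url: str = OLLAMA_URL,
                        timeout: float = 30.0) -> str:
    """POST the payload to a local Ollama server and return the generated text."""
    body = json.dumps(build_payload(raw_vsr_text)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    # Non-streaming replies carry the generated text in the "response" field.
    return reply["response"].strip()
```

Calling `correct_with_ollama("HELLO HOW ARE YOU")` requires an Ollama server running locally with the chosen model already pulled; `build_payload` can be inspected on its own without any server.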
Stars: 714
Forks: 74
Language: Python
License: MIT
Last pushed: Nov 02, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/amanvirparhar/chaplin"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
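The same request can be made from Python. The sketch below is a minimal example using only the standard library; it assumes the endpoint returns JSON and that the path segments follow a category/owner/repo pattern, which is inferred from the single URL shown above rather than from any API documentation. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (assumed pattern: category/owner/repo)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository, assuming a JSON response body."""
    with urllib.request.urlopen(quality_url(category, owner, repo),
                                timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For this repository the call would be `fetch_quality("voice-ai", "amanvirparhar", "chaplin")`. Keyless requests count against the 100 requests/day limit, so caching the result is sensible if the data is polled.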
Related tools
meizhong986/WhisperJAV
ASR/STT subtitle generator. Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD. Noise-robust for JAV
BryceWG/BiBi-Keyboard
说点啥 (BiBi Keyboard): a Kotlin-based keyboard app for Android that provides LLM- and ASR-powered voice input. An LLM ASR voice input method...
DevEmperor/Dictate
A powerful Whisper AI keyboard for reliable speech transcription
vivekuppal/transcribe
Transcribe is a real-time transcription, conversation, and language-learning platform. It provides...
sindresorhus/awesome-whisper
🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI