HenestrosaDev/audiotext
A desktop application that transcribes audio from files, microphone input or YouTube videos with the option to translate the content and create subtitles.
Supports three transcription backends: Google Speech-to-Text API, OpenAI's Whisper API, and WhisperX. WhisperX runs locally and exposes configurable parameters such as model size, compute type, and batch size. Built with Python and CustomTkinter for the desktop UI, it can process individual audio files or whole directories, and it offers subtitle customization including word-level highlighting and a maximum line width. Transcription covers 99 languages, with optional translation when using the Whisper-based methods.
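To make the line width constraint concrete, here is a minimal sketch of how timed segments could be rendered as SRT cues with wrapped text. It is not taken from audiotext's code: the names `format_timestamp`, `to_srt`, and `max_line_width`, and the `(start, end, text)` segment shape, are assumptions made for illustration.

```python
import textwrap


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments, max_line_width=42):
    """Render (start, end, text) tuples as SRT cues.

    Each cue's text is wrapped so that no line exceeds max_line_width
    characters (hypothetical parameter name, chosen for this sketch).
    """
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        wrapped = textwrap.fill(text.strip(), width=max_line_width)
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        cues.append(f"{index}\n{timing}\n{wrapped}\n")
    # Cues are separated by a blank line, as the SRT format expects.
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [(0.0, 2.5, "This is a fairly long subtitle line that needs wrapping.")]
    print(to_srt(demo, max_line_width=30))
```

A narrower `max_line_width` gives more, shorter lines per cue. Word-level highlighting would additionally need per-word timestamps, which this sketch does not model.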
345 stars. No commits in the last 6 months.
Stars: 345
Forks: 32
Language: Python
License: —
Category:
Last pushed: Oct 15, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/HenestrosaDev/audiotext"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
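The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the `category/owner/repo` path pattern is inferred from the single URL shown above, the response is assumed to be JSON, and `quality_url` and `fetch_quality` are illustrative names, not part of any published client.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (path pattern assumed from the one example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record, assuming the endpoint returns JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    data = fetch_quality("voice-ai", "HenestrosaDev", "audiotext")
    # The response schema is not documented here, so just pretty-print it.
    print(json.dumps(data, indent=2))
```

With a limit of 100 unauthenticated requests per day, it is worth caching responses locally rather than re-fetching on every run.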
Higher-rated alternatives
jianchang512/stt
Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain text formats
Jaymon/transcribe
Convert images or audio files to plain text on the command line
cyberofficial/Synthalingua
Synthalingua - Real Time Translation
developers-cosmos/Mimasa
Real time multilingual face translator
lperezmo/real-time-translator
A quick app to translate speech in real time using the Whisper API for transcribing audio,...