All Voice AI Tools
6,981 tools ranked by quality score · Page 44 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 4301 | TicooLiu/HowTo-ASR · Guide to training custom-data models for open-source speech recognition | | Experimental |
| 4302 | kadirpili/text-to-video-bot-python · Script that generates TikTok-style videos using ffmpeg, moviepy, ChatGPT,... | | Experimental |
| 4303 | stefanpantic/asr · Automatic speech recognition using neural networks | | Experimental |
| 4304 | jonbrennecke/CaptionThis · "Caption This" is an iOS app that adds real-time captions to videos for... | | Experimental |
| 4305 | pushkar009/Smart-Room-Assistant · This is a repository containing the Mega project code for Smart Room Assistant. | | Experimental |
| 4306 | Konstantinos123456789/JARVIS_AI · A modular Python AI Assistant (Jarvis) featuring Knowledge Graphs... | | Experimental |
| 4307 | loganngarcia/chaplin-ui · Web interface for a real-time silent speech recognition tool. | | Experimental |
| 4308 | ali7919/Talk-With-LLM-In-Unity · Speech Recognition + LLM inference on device in Unity | | Experimental |
| 4309 | uberduck-ai/openduck · Building an open-source interactive AI plush toy. | | Experimental |
| 4310 | SakshiRathi77/hindiSpeechPro-Automatic-Speech-Recognization · The project, being part of Kagglex BIPOC Mentorship Program final project,... | | Experimental |
| 4311 | iGerman00/buttercup-chrome · A Chrome(ium) extension to replace YouTube's auto-captions with... | | Experimental |
| 4312 | bluenekozkm/moe-tts-webui · The better web UI for MOE-TTS | | Experimental |
| 4313 | deepgram-starters/deno-text-to-speech · Get started using Deepgram's Text-to-Speech with this Deno demo app | | Experimental |
| 4314 | incubated-geek-cc/Text-To-Speech-App · A Fusion of OCR Technology (Tesseract.js) & Web Speech API. Standalone,... | | Experimental |
| 4315 | dingdangdog/VwordAi · VwordAi is a text-to-speech tool that supports multiple speech service providers, letting you easily convert text into natural, fluent speech. | | Experimental |
| 4316 | RhythmusByte/Sign-Language-to-Speech · Real-time ASL interpreter using OpenCV and TensorFlow/Keras for hand gesture... | | Experimental |
| 4317 | selmetwa/AnkiTTS · Add audio to your Anki deck by leveraging the ElevenLabs text-to-speech API | | Experimental |
| 4318 | DarkOracle10/Video-to-Persian-Translator---Professional-AI-Translation-Pipeline · Professional-AI-Translation-Pipeline | | Experimental |
| 4319 | mathquis/node-kaldi-online-nnet3-decoder · ASR online decoding using Kaldi NNet3 GrammarFST | | Experimental |
| 4320 | miaogang1982/mod_ali · FreeSWITCH extension module that implements speech synthesis based on Alibaba Cloud | | Experimental |
| 4321 | smswg/FreeSwitch-Mod_Asr · FreeSWITCH Alibaba Cloud Mod_ASR module that connects directly to Alibaba Cloud's large ASR model, developed with Alibaba Cloud's latest (2026) C++ SDK 3.2 and proven stable through extensive production testing. Can be used for AI... | | Experimental |
| 4322 | Agrover112/Goodness-of-Pronunciation-Pipelines-for-OOV-Problem · Goodness of Pronunciation Pipelines for OOV Removal | | Experimental |
| 4323 | bagustris/id · Iban-based Kaldi recipe for Indonesian speech corpus, presented at ASJ Spring 2019. | | Experimental |
| 4324 | CaesiumY/dding-dong · Claude Code notification plugin: sound alerts & OS notifications on task... | | Experimental |
| 4325 | mrhallonline/WhisperXTranscription4Researchers · This repository contains a Jupyter notebook for qualitative researchers to... | | Experimental |
| 4326 | speechnotes/speechnotes-website · New (2023) Doks (Hugo + npm) based website for speechnotes.co | | Experimental |
| 4327 | dnkilic/android-sesli-haber · DEPRECATED - This application is created by a group of students who finished... | | Experimental |
| 4328 | Cinnamon/whisper-jargon · [SIGDIAL'24] Improving Speech Recognition with Jargon Injection | | Experimental |
| 4329 | innovate-invent/ChatStream · Multiplatform OBS Chat overlay | | Experimental |
| 4330 | RazEini/e_commerce_shop · Android E-Commerce App with Firebase Realtime DB, Authentication, Smart... | | Experimental |
| 4331 | Philipelima/video-translate · Have you ever thought about translating a YouTube video? That is the idea for... | | Experimental |
| 4332 | JarbasAl/pocketsphinx-models-mirror · pocketsphinx models for languages originating from the Iberian Peninsula | | Experimental |
| 4333 | takeoutfm/takeout_assistant · Offline voice assistant for Android | | Experimental |
| 4334 | wq2012/mdeval · Python implementation of the NIST md-eval.pl script for evaluating rich... | | Experimental |
| 4335 | light12222/Voice2Sub-Whisper-Live-Translator · Real-time speech-to-text, subtitle overlay, and translation tool. Powered by... | | Experimental |
| 4336 | Maxborland/mindtype-app · MindType: Voice-to-text with AI-powered summaries. 100+ languages, works... | | Experimental |
| 4337 | dangvansam/phoneme2grapheme-vietnamese · Convert phonemes to graphemes for Vietnamese | | Experimental |
| 4338 | rafaotetra/awesome-coding-by-voice · A list of videos, papers, tools, APIs and projects about coding by voice | | Experimental |
| 4339 | ab-smith/kokoro-tts-webui · Gradio-based web UI for Kokoro to simplify its usage with multiple voices,... | | Experimental |
| 4340 | profdilley/markdown-speech-converter · This tool converts Markdown files into **speech-friendly plain text** files.... | | Experimental |
| 4341 | Ziggx5/TalkToText · Speech-to-text app built with Python and the Vosk speech recognition engine | | Experimental |
| 4342 | fr0stb1rd/Edge-TTS-Subtitle-Dubbing · High-performance SRT to Audio Dubbing tool using Microsoft Edge TTS with... | | Experimental |
| 4343 | bionicop/TalkativeSubs · Bring your subtitles to life with TalkativeSubs, a tool that converts SRT... | | Experimental |
| 4344 | HungerCoder01/jarvis-voice-assistant · A Python-based voice assistant built while learning speech recognition,... | | Experimental |
| 4345 | qwertypool/Python-Personal-Desktop-Assistant · A personal assistant that automates your tasks, such as searching videos in... | | Experimental |
| 4346 | RamanSharma100/Reactjs-voice-controllable-website · A voice-controllable website using React.js and the YouTube API | | Experimental |
| 4347 | miguelangelnieto/DNN-Speech-Recognizer · Built a deep neural network that functions as part of an end-to-end... | | Experimental |
| 4348 | Kaljurand/speech-trigger · Android Speech Recognizer service based on... | | Experimental |
| 4349 | nasrul21/kunci-tts-api · API for getting answer keys to Indonesian TTS (Teka Teki Silang, i.e. crossword puzzles) | | Experimental |
| 4350 | gastonmorixe/elevenlabs-reader-cli · Unofficial ElevenLabs Reader CLI: create, stream, and play TTS with live karaoke | | Experimental |
| 4351 | row-engineering/ai-narration · A WordPress plugin that converts your posts into audio narrations using AI... | | Experimental |
| 4352 | RiccardoGrin/TerminalWhisper · Voice-to-text for Windows using OpenAI Whisper. Hold a hotkey, speak, text appears. | | Experimental |
| 4353 | Shuichi346/qwen-voice-clone-webui · A Gradio WebUI for voice cloning powered by Qwen3-TTS. Provide reference... | | Experimental |
| 4354 | ivallesp/Xception1d · Xception1d implementation for audio categorization | | Experimental |
| 4355 | SUBHADIPMAITI-DEV/Speech-Recognition-Alexa · A simple Python script that uses various libraries for speech recognition... | | Experimental |
| 4356 | lucaslattari/IAGiroDeNoticias · Repository for the project presented in the video "Bot that creates a podcast by itself??... | | Experimental |
| 4357 | rwightman/tensorflow-speech_commands · Speech commands training/models from TF repo adapted for speech commands Kaggle | | Experimental |
| 4358 | useviolet/violetaudio · Voice AI infrastructure and audio processing toolkit | | Experimental |
| 4359 | Yangyangii/AdvDCTTS · Implementation of DCTTS with Adversarial Training | | Experimental |
| 4360 | terry-yip/speech-to-text · Speaker diarization and speech to text | | Experimental |
| 4361 | codersinthestorm/RecurrentNN_SpeechRecognition · A TensorFlow-based model to recognize words from the 30-word Speech... | | Experimental |
| 4362 | mp-web3/jarvis-v3 · Fully local voice interface for Claude Code on Apple Silicon. Parakeet STT +... | | Experimental |
| 4363 | GhostNaN/silero-webui · Silero TTS web UI | | Experimental |
| 4364 | zssloth/TF-Speech-Recognition · Speech Recognition Using TensorFlow | | Experimental |
| 4365 | AlexKly/Simple-Voice-Activity-Detector-using-MFCC-based-on-FPGA-Kintex · Voice Activity Detector based on MFCC features and DNN model | | Experimental |
| 4366 | ccnixx/rt-stt-demo-app · Real-time speech-to-text web app. | | Experimental |
| 4367 | innovatorved/tts-app · This application converts text or PDF documents into speech using the... | | Experimental |
| 4368 | raghavkumar06/jarvis-ai-assistant · Python-based voice assistant that performs tasks using speech recognition... | | Experimental |
| 4369 | EthanC/Eavesdrop · Discord Bot that transcribes voice messages and media attachments. Powered... | | Experimental |
| 4370 | syado/discord-vc-tts · Bot that reads messages from a Discord text channel aloud in a voice channel | | Experimental |
| 4371 | ashfaaqrifath/Casper-PC-Assistant · PC assistant with voice/text control, automating tasks using APIs, system... | | Experimental |
| 4372 | pilot7747/VoxDIY · This repository provides data and code for "Vox Populi, Vox DIY: Benchmark... | | Experimental |
| 4373 | caiobd/sprite-ai · Sprite AI - An AI companion for your desktop | | Experimental |
| 4374 | lugia19/renpyDialogToAudio · Takes a Ren'Py dialog export and generates voices using ElevenLabs | | Experimental |
| 4375 | obro79/stormhacks · Deploy full-stack web apps with zero typing, just your voice. | | Experimental |
| 4376 | Harras3/unhallucinated-faster-whisper · 'unhallucinated-faster-whisper,' a powerful enhancement built on the... | | Experimental |
| 4377 | antonin-lfv/ESP32-robot-piloting-with-TinySpeech · Offline Keyword Spotting on ESP32-S3. TinySpeech implementation using... | | Experimental |
| 4378 | speak-rs/speakly · High-performance, extensible speech recognition toolkit for Rust: OpenAI... | | Experimental |
| 4379 | TheMadMartina/Nexa · Nexa is a Python AI voice assistant leveraging speech recognition and... | | Experimental |
| 4380 | zry98/pomumd · Wyoming Protocol TTS and STT & MLX LLM server for iOS/macOS | | Experimental |
| 4381 | jakariaemon/WSI · Whisper Speaker Identification (WSI), a cutting-edge model for multilingual... | | Experimental |
| 4382 | serkanalgur/turkish-tts · Turkish TTS with Piper TTS | | Experimental |
| 4383 | KrishnaDN/LAS-Pytorch · Implementation of the paper "Listen, Attend and Spell" in PyTorch | | Experimental |
| 4384 | deepgram-starters/php-text-to-speech · Get started using Deepgram's Text-to-Speech with this PHP demo app | | Experimental |
| 4385 | CanadianCrafter/EngHacks2021-Text-To-Speech · Text to Speech Highlighter is a Chrome extension that allows the user to... | | Experimental |
| 4386 | noly24/spoken-subtitles · Chrome extension that reads subtitles aloud on streaming sites for accessibility | | Experimental |
| 4387 | adakrupp/voice-cloning · Local AI voice cloning with Coqui TTS XTTS-v2 - Docker-ready, GPU-accelerated | | Experimental |
| 4388 | avreliusdante-web-creator/voice-input · Browser extension: convert voice to text and send it with one click in open... | | Experimental |
| 4389 | klimromanyuk/tg-tts-sum-bot · Telegram bot with LLM (Ollama) and voice synthesis (Qwen3-TTS / Edge-TTS) | | Experimental |
| 4390 | tqer39/tts-partner · TTS Partner repository | | Experimental |
| 4391 | elerdg/ASR-for-low-resource-languages · Fine-tune wav2vec2-xls-r on data from low-resource languages | | Experimental |
| 4392 | horatio-sans-serif/speeker · TTS MCP, CLI, HTTP API with multiple engines, voice cloning, daemon for... | | Experimental |
| 4393 | martins-vds/my-assistant · A voice-driven personal task-tracking assistant for tech workers who... | | Experimental |
| 4394 | Atqarana/AI-Voicebot-for-Kids · An interactive companion toy that engages kids with storytelling, singing,... | | Experimental |
| 4395 | d1pankarmedhi/CascadeS2S · A low-latency (<5s) cascade-style speech-to-speech conversational system | | Experimental |
| 4396 | leonardofmed/stt-chat-tts · This is a Python project that uses different modules to capture audio from a... | | Experimental |
| 4397 | sergicastellasape/gpt-reviews · Code for GPT Reviews, a daily AI-generated podcast | | Experimental |
| 4398 | ThakkarVidhi/ai-banking-agent · Vaulta AI is a voice-driven banking agent that authenticates users through a... | | Experimental |
| 4399 | Bsh54/AI_Phone_Call · Web application that transforms traditional speech synthesis into... | | Experimental |
| 4400 | vaishnavipatil29/Voice-Chatbot · Voice Chatbot, Course Project, Speech Processing | | Experimental |