All Voice AI Tools
6,981 tools ranked by quality score · Page 24 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 2301 | BayramAnnakov/gmail-to-podcast: Transform Gmail newsletters into AI-generated podcast conversations using... | | Emerging |
| 2302 | adelacvg/detail_tts: All generative models in one for a better TTS model | | Emerging |
| 2303 | jonelo/jAdapterForNativeTTS: A simple pure Java library that allows you to use the native Text To Speech... | | Emerging |
| 2304 | daslearning-org/text-to-speech-offline: A lightweight cross-platform Text-To-Speech application which works on... | | Emerging |
| 2305 | FOLLGAD/reddit-video-maker: AI video content creation before it was cool | | Emerging |
| 2306 | black-roland/homeassistant-salutespeech: SaluteSpeech integration for Home Assistant providing speech-to-text and... | | Emerging |
| 2307 | tts-hub/monotonic_alignment_search: Monotonically align text and speech | | Emerging |
| 2308 | botbahlul/js-live-audio-video-translate: HTML Web template that can RECOGNIZE any live audio/video streaming (using... | | Emerging |
| 2309 | Otosaku/OtosakuTTS-iOS: Swift library for offline text-to-speech synthesis on iOS/macOS. Generate... | | Emerging |
| 2310 | clarinsi/Slovene_ASR_e2e: Automatic Speech Recognition tool | | Emerging |
| 2311 | loushou/flutter_tts_improved: A fork of the Flutter_TTS (https://github.com/dlutton/flutter_tts) plugin,... | | Emerging |
| 2312 | Julia-Roman/pepega-tts: Discord bot for Google and Polly Text-to-Speech | | Emerging |
| 2313 | jfainberg/lattice_combination: Lattice combination algorithm to combine inaccurate transcripts with... | | Emerging |
| 2314 | linto-ai/linto-diarization: Speaker diarization service | | Emerging |
| 2315 | chameleon-ai/vevo: Simple GUI for Amphion Vevo | | Emerging |
| 2316 | rshahamiri/SpeechVision: Speech Vision (SV) is a Dysarthric Speech Recognition System that adopts a... | | Emerging |
| 2317 | daanzu/kaldi_ag_training: Docker image and scripts for training finetuned or completely personal Kaldi... | | Emerging |
| 2318 | SkyDocs/speaker-identification: Speaker Identification using Neural Net. | | Emerging |
| 2319 | valeriorlandini/sonus: A Max/MSP package for sound experimentation and algorithmic composition | | Emerging |
| 2320 | frrobledo/AutoDub: An advanced AI-powered tool that automatically translates and dubs YouTube... | | Emerging |
| 2321 | sahu-adarsh/intervyu: Practice job interviews with Neerja, an AI interviewer powered by Claude.... | | Emerging |
| 2322 | daswer123/silero-tts-enhanced: Silero TTS Enhanced is a Python library that enhances the original Silero... | | Emerging |
| 2323 | sshh12/Recording-Bot: A bot built to record and transcribe audio fragments from Discord. | | Emerging |
| 2324 | aws-samples/amazon-transcribe-email-workflow: An Amazon Transcribe demo for "speech-to-text" conversion performed through... | | Emerging |
| 2325 | SaptakBhoumik/easySpeech: easySpeech is an open-source Python wrapper for google speech to text API... | | Emerging |
| 2326 | OPEXGroup/ITCC.YandexSpeechKitClient: Cross-platform client for Yandex SpeechKit Cloud API | | Emerging |
| 2327 | thc1006/whisper-colab-tpu-transcriber: High-performance Google Colab Notebook for fast & accurate audio... | | Emerging |
| 2328 | geekgirljoy/PHP: Examples of my PHP Code | | Emerging |
| 2329 | Abhishek-op/SR: 💡 Kivy-android speech recognition | | Emerging |
| 2330 | cmsflash/deep-learning-sota: State-of-the-art results for deep learning tasks in various fields. | | Emerging |
| 2331 | zsl24/Tacotron2-Mandarin-HiFiGAN: Implementation of TTS with combination of Tacotron2 and HiFi-GAN | | Emerging |
| 2332 | PRITHIVSAKTHIUR/Vision-to-VibeVoice-en: A Gradio-based demo for end-to-end vision-to-speech inference: Extract text... | | Emerging |
| 2333 | robmsmt/SpeechLoop: Many ASRs under one roof. With Benchmarking... answering the question. What... | | Emerging |
| 2334 | JSON2Video/json2video-nodejs-sdk: Create videos programmatically in the cloud from NodeJS: add watermarks,... | | Emerging |
| 2335 | NICEElevateAI/ElevateAIDotNetSDK: .NET Core 6 SDK for ElevateAI | | Emerging |
| 2336 | liou666/audiread: 📻 A simple and user-friendly online TTS tool. | | Emerging |
| 2337 | brailcom/speechd-el: Emacs speech and Braille output interface | | Emerging |
| 2338 | ARAI-Telegram/teledash-backend-processing: Optional AI-powered features of Teledash, an open-source software for... | | Emerging |
| 2339 | Audio-WestlakeU/UMA-ASR: This repository is the official implementation of unimodal aggregation (UMA)... | | Emerging |
| 2340 | IndieCoderMM/smart-one-ai: 🤖 AI assistant that can listen to user input and provide responses. It... | | Emerging |
| 2341 | revsic/speechset: Numpy-librosa implementation of Speech dataset pipeline | | Emerging |
| 2342 | abinashmeher999/voice-data-extract: A command line interface to combine text information from subtitles with... | | Emerging |
| 2343 | jumon/pywer: A simple Python package to calculate word error rate (WER). | | Emerging |
| 2344 | mikopbx/ModuleRHVoice: Text to speech voice generator by the RHVoice algorithm | | Emerging |
| 2345 | tcsenpai/audiocoqui: A multilingual tool to convert PDF ebooks to audiobooks using XTTS v2 TTS... | | Emerging |
| 2346 | eliangerard/simple-tts-mp3: Converts text to MP3 audio using google-tts-api, with no limit | | Emerging |
| 2347 | ameerbadri/twilio-asr-realtime-dashboard: Twilio ASR and Intent Realtime Dashboard | | Emerging |
| 2348 | nico-byte/whisper-web: The Whisper Web Transcription Server is a Python-based real-time... | | Emerging |
| 2349 | rapidaai/rapida-python: Open-source Python SDK for real-time Voice AI, voice agents, streaming... | | Emerging |
| 2350 | nemoramo/acoustic_model: This is a sub-repository in building to create acoustic model in Mandarin... | | Emerging |
| 2351 | ndenicolais/SpeechAndText: Android application built with Kotlin and Jetpack Compose that shows how to... | | Emerging |
| 2352 | twirapp/silero-tts-api-server: This is a simple server that uses Silero models to convert text to audio... | | Emerging |
| 2353 | jorcelinojunior/whisper-vtt2srt: A robust WebVTT to SRT converter optimized for AI transcriptions (Whisper,... | | Emerging |
| 2354 | LibraryOfCongress/speech-to-text-viewer: AWS Transcribe evaluation pipeline: bulk-process audio files and view the results | | Emerging |
| 2355 | acyclics/speech-to-speech-translator: Enables a device to input speech from a microphone, translate speech to a... | | Emerging |
| 2356 | build-with-groq/groq-voice-agent-template: A real-time voice AI agent built with Groq API that enables natural voice... | | Emerging |
| 2357 | fano2458/Zhadiger-Kazakh-Language-AI: AI services project "Zhadiger" for Kazakh Language developed using NVIDIA... | | Emerging |
| 2358 | Martouta/speech_processor: Speech-to-text from videos and audios (including youtube and tiktok links) | | Emerging |
| 2359 | empowerai/fs-middlelayer-api: US Forest Service ePermit API | | Emerging |
| 2360 | Adibian/ResGrad: Unofficial implementation of ResGrad: Residual Denoising Diffusion... | | Emerging |
| 2361 | nonwill/GoldenDict-OCR: GoldenDict++: Optimizations for faster dictionary loading and searching,... | | Emerging |
| 2362 | kurianbenoy/malayalam_asr_benchmarking: A study to benchmark whisper based ASRs in Malayalam | | Emerging |
| 2363 | lottev1991/Project-AIdol-Public-English-Dataset: Public female English corpus used for Project AI❤dol | | Emerging |
| 2364 | avarayr/yap-for-cursor: Yap for Cursor - Voice To Text integration for Cursor IDE | | Emerging |
| 2365 | clloret/speaking-practice: An Android application to practice English pronunciation | | Emerging |
| 2366 | amscotti/hn-podcaster: The HackerNews Podcaster is a JavaScript application that utilizes the power... | | Emerging |
| 2367 | parthgupta1208/VoiceCraft: Voice Craft is a desktop AI assistance tool designed to help people with... | | Emerging |
| 2368 | rafaelvalle/asrgen: Attacking Speaker Recognition with Deep Generative Models | | Emerging |
| 2369 | I5UCC/VRCTextboxSTT: A SpeechToText application that uses OpenAI's whisper via faster-whisper to... | | Emerging |
| 2370 | reyniel26/bleepy: Bleepy is a Python program that can block Tagalog and English profanity in... | | Emerging |
| 2371 | qiujiali/lattice_rnn: Bi-directional Lattice Recurrent Neural Networks for Confidence Estimation | | Emerging |
| 2372 | andi611/Conditional-SpecGAN-Tensorflow: Text-to-Speech Synthesis by Generating Spectrograms using Generative... | | Emerging |
| 2373 | lars76/forced-alignment-chinese: Mandarin Chinese audio datasets aligned with Montreal Forced Aligner | | Emerging |
| 2374 | Yeti47/Vosk4Unity: Vosk4Unity is a module for the Unity Engine that provides a simple way to... | | Emerging |
| 2375 | xeden3/MSSpeechServer: MSSpeechServer is a REST server based on the Microsoft Speech Platform that... | | Emerging |
| 2376 | doubleZ0108/Human-Computer-Interaction: Human-Computer Interaction \| Tongji Univ. SSE Course Projects | | Emerging |
| 2377 | deepily/genie-in-the-box: Genie in the Box: Distill Whisper STT => Mistral-7B =>... | | Emerging |
| 2378 | EuleMitKeule/speaker-recognition: Speaker recognition service for Home Assistant using voice embeddings. Train... | | Emerging |
| 2379 | jame25/Piper-Tray: Piper Tray is a lightweight system tray utility written in C# for use with Piper TTS. | | Emerging |
| 2380 | scottgl9/openclaw-matrix-voice: Matrix voice call bot with LiveKit, Whisper STT, and Chatterbox TTS,... | | Emerging |
| 2381 | zhihanyang2022/gender-audio-classification: A speaker gender classifier. MFC feature engineering and a pre-trained... | | Emerging |
| 2382 | nezhar/speech-condenser: A tool for summarizing dialogues from videos or audio | | Emerging |
| 2383 | FomTarro/word-salad: Twitch TTS redeem that uses sentence mixing instead of synthesis. | | Emerging |
| 2384 | EmZod/Speak-Turbo: Ultra-fast local TTS for AI agents. ~90ms to first sound. | | Emerging |
| 2385 | asaddi/f5-tts-serve: A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful... | | Emerging |
| 2386 | slackr31337/wyoming-piper-gpu: Wyoming Piper docker container with GPU support for Home-Assistant | | Emerging |
| 2387 | shafaypro/PYSHA: A Simple Virtual Assistant Built in Python 3.5 | | Emerging |
| 2388 | NONAN23x/WhisperingNova: An AI voice changer harnessing the power of OpenAI and VoiceVox for... | | Emerging |
| 2389 | MiguelsPizza/local-transcription-mcp--parakeet-tdt-0.6b-v2--: Local MCP server that converts and transcribes video and audio files 100% on device | | Emerging |
| 2390 | Rishav-Agarwal/Translate-Language_Translator: An Android app that allows you to translate text and phrases between 90+... | | Emerging |
| 2391 | legekka/GanyuTTS: A small VITS+SOVITS/RVC TTS API | | Emerging |
| 2392 | Vishnu-tppr/NEXORA-AI: Made with Python, crafted by Vishnu 💻✨ Nexora AI – A smart Python voice... | | Emerging |
| 2393 | koudounasalkis/AI4Voice: This repo contains the code for "Voice Disorder Analysis: A... | | Emerging |
| 2394 | msalhab96/MultiSpeech: PyTorch implementation for MultiSpeech: Multi-Speaker Text to Speech with... | | Emerging |
| 2395 | xnmeet/voi: A text-to-speech plugin for [Bob](https://bobtranslate.com/) that uses a locally deployed Kokoro model as the speech synthesis service. | | Emerging |
| 2396 | LuluW8071/Conformer: End-to-End Speech Recognition Training with Conformer CTC using PyTorch Lightning⚡ | | Emerging |
| 2397 | skit-ai/speech-recognition: SDKs and docs for Skit's speech to text service | | Emerging |
| 2398 | Yuan-ManX/ComfyUI-ChatterboxTTS: ComfyUI-ChatterboxTTS is now available in ComfyUI, Chatterbox is the first... | | Emerging |
| 2399 | AsoSoft/AsoSoft-TTS-Speech-Corpus-for-Central-Kurdish: AsoSoft Speech Corpus for Central-Kurdish Text-To-Speech | | Emerging |
| 2400 | fernicar/Parakeet_GUI_TINS_Edition: A desktop application built using the TINS paradigm for transcribing audio... | | Emerging |