All Voice AI Tools
6,981 tools ranked by quality score · Page 12 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 1101 | mastashake08/speech-kit: Simplifying the Speech Synthesis and Speech Recognition engines for... | | Emerging |
| 1102 | ranchlai/mandarin-tts: Chinese Mandarin text-to-speech (TTS) by FastSpeech 2,... | | Emerging |
| 1103 | tikhonp/yandex-speechkit-lib-python: Python SDK for Yandex Speechkit API. | | Emerging |
| 1104 | nipponjo/tts_arabic: 🎙️ Arabic TTS models (FastPitch, Mixer-TTS) in the ONNX format — Python... | | Emerging |
| 1105 | apinge/MeloTTS.cpp: A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO,... | | Emerging |
| 1106 | WanderingAstronomer/Vociferous: Vociferous captures audio from your microphone, transcribes it in real-time... | | Emerging |
| 1107 | vb000/Waveformer: A deep neural network architecture for low-latency audio processing | | Emerging |
| 1108 | skirdey/voicerestore: VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration | | Emerging |
| 1109 | atomicoo/PTTS-WebAPP: Parallel TTS single-page web demo based on Flask + Vue (Vuetify). | | Emerging |
| 1110 | fishaudio/docs: Official documentation for products, services, and projects by Fish Audio | | Emerging |
| 1111 | SILMA-AI/silma-tts: SILMA TTS v1 Official Repo — a Lightweight Open Bilingual Text to Speech Model | | Emerging |
| 1112 | tabahi/formantfeatures: Extract frequency, power, width and dissonance of formants from wav files | | Emerging |
| 1113 | aviaryan/voice-writing-electron: A real-time, instant dictation desktop application built on Electron that... | | Emerging |
| 1114 | Gyyyn/OpenWebTTS: Open source Speechify alternative. Read PDFs and EPUBs with local models. | | Emerging |
| 1115 | scart97/thunder-speech: A hackable speech recognition library. | | Emerging |
| 1116 | CodersCreative/natural-tts: A Rust crate for easily implementing Text-To-Speech into your Rust programs. | | Emerging |
| 1117 | TigreGotico/phoonnx: A Python library for multilingual phonemization and Text-to-Speech (TTS)... | | Emerging |
| 1118 | aahl/qwen-tts2api: 🗣️ Qwen TTS to OpenAI Speech API | | Emerging |
| 1119 | sc0ty/subsync: Subtitle Speech Synchronizer | | Emerging |
| 1120 | showlab/whisperVideo: Find out who said what in the video. | | Emerging |
| 1121 | Purfview/whisper-standalone-win: Whisper & Faster-Whisper standalone executables for those who don't want to... | | Emerging |
| 1122 | Bebra777228/PolGen-RVC: VITS-based voice conversion. Focused on simplicity, quality and... | | Emerging |
| 1123 | nvidia-riva/common: Protocol buffers and other common resources. | | Emerging |
| 1124 | spotify/basic-pitch-ts: A lightweight yet powerful audio-to-MIDI converter with pitch bend detection. | | Emerging |
| 1125 | jinserk/pytorch-asr: ASR with PyTorch | | Emerging |
| 1126 | lperezmo/real-time-translator: A quick app to translate speech in real time using the Whisper API for... | | Emerging |
| 1127 | CiscoDevNet/g2p_seq2seq_pytorch: Grapheme to phoneme model for PyTorch | | Emerging |
| 1128 | USStateDept/State-TalentMAP: A comprehensive research, bidding, and matching system to match Foreign... | | Emerging |
| 1129 | NateRickard/Xamarin.Cognitive.Speech: A client library that makes it easy to work with the Microsoft Cognitive... | | Emerging |
| 1130 | SteTR/Emost-Bot: Discord Music Bot using Voice Recognition to receive commands. | | Emerging |
| 1131 | SlashNephy/SimpleVoiceroid2Proxy: Control VOICEROID 2 through an HTTP API | | Emerging |
| 1132 | rafaballerini/AssistentePessoal: Virtual personal assistant developed with Python 🤖 | | Emerging |
| 1133 | mailong25/self-supervised-speech-recognition: Speech to text with self-supervised learning based on the wav2vec 2.0 framework | | Emerging |
| 1134 | mapbox/mapbox-speech-swift: Natural-sounding text-to-speech in Swift or Objective-C on iOS, macOS, tvOS,... | | Emerging |
| 1135 | litongjava/whisper-cpp-server: whisper-cpp-serve Real-time speech recognition and c+ of OpenAI's Whisper... | | Emerging |
| 1136 | wq2012/SpeakerRecognitionFromScratch: Final project for the Speaker Recognition course on Udemy, 机器之心, 深蓝学院 and 语音之家 | | Emerging |
| 1137 | LynxLine/qtspeech: QtSpeech is a cross-platform library based on Qt to provide common... | | Emerging |
| 1138 | MyrtleSoftware/deepspeech: A PyTorch implementation of DeepSpeech and DeepSpeech2. | | Emerging |
| 1139 | keonlee9420/Comprehensive-Tacotron2: PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning... | | Emerging |
| 1140 | drankush/VoxRad: VOXRAD is a voice transcription application for radiologists leveraging... | | Emerging |
| 1141 | overcrash66/OpenTranslator: Open Translator: Speech To Speech and Speech to text Translator with voice... | | Emerging |
| 1142 | mobilequickie/AmazonSpeechTranslator: End-to-end Solution for Speech Recognition, Text Translation, and... | | Emerging |
| 1143 | charlesliucn/awesome-end2end-asr: 💬 A list of End-to-End speech recognition, including papers, codes and other... | | Emerging |
| 1144 | keonlee9420/Daft-Exprt: PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across... | | Emerging |
| 1145 | XimilalaXiang/DeLive: DeLive is a cross-platform desktop app that captures system audio output and... | | Emerging |
| 1146 | IBM/BigLittleNet: Official repository for Big-Little Net | | Emerging |
| 1147 | spokestack/react-native-spokestack: Spokestack: give your React Native app a voice interface! | | Emerging |
| 1148 | everydaycodings/MimicMania: MimicMania is a web application that allows you to generate speech and clone... | | Emerging |
| 1149 | weespin/WillFromAfarDownloader: acapellabox pwned. | | Emerging |
| 1150 | mush42/optispeech: A lightweight end-to-end text-to-speech model | | Emerging |
| 1151 | fizamusthafa/whisper-app: This repository contains a web application for multi-lingual transcription... | | Emerging |
| 1152 | ActiveNick/Unity-MS-SpeechSDK: Sample Unity project used to demonstrate Speech Recognition using the new... | | Emerging |
| 1153 | techiaith/pyfestival: Python C wrapper to make programming with Festival easier | | Emerging |
| 1154 | DanRuta/xVA-Synth: Machine learning based speech synthesis Electron app, with voices from... | | Emerging |
| 1155 | domesticatedviking/TextyMcSpeechy: Easily create Piper text-to-speech models in any voice. Make a... | | Emerging |
| 1156 | jackaduma/LAS_Mandarin_PyTorch: Listen, Attend and Spell model and a pretrained Chinese Mandarin model ... | | Emerging |
| 1157 | moshehbenavraham/Voice-Agent-PuPuPlatter: Multi-provider voice AI showcase featuring 7 providers (ElevenLabs + Widget,... | | Emerging |
| 1158 | NeuralFalconYT/Video-Dubbing: Since most video dubbing services are paid, this project explores an... | | Emerging |
| 1159 | patrickmonteiro/quasar-speech-api: 🎤 🔉 SPA project built with Quasar Framework 1.0 + Speech API... | | Emerging |
| 1160 | sberdevices/smart_app_framework: SmartApp Framework for building skills for the family of Virtual Assistants... | | Emerging |
| 1161 | puff-dayo/Kokoro-82M-Android: A minimal Android demo app for Kokoro-TTS | | Emerging |
| 1162 | sksalahuddin2828/AI_Personal_Digital_Assistant: AI Personal Voice Assistant Project (Male - Female version) | | Emerging |
| 1163 | voice-engine/make-a-smart-speaker: A collection of resources to make a smart speaker | | Emerging |
| 1164 | astramind-ai/Auralis: A Fast TTS Engine | | Emerging |
| 1165 | MikeyParton/react-speech-kit: React hooks for Speech Recognition and Speech Synthesis | | Emerging |
| 1166 | Yuan-ManX/audio-development-tools: Audio Development Tools (ADT) is a project for advancing sound, speech, and... | | Emerging |
| 1167 | Pranjalya/tts-tortoise-gradio: A Gradio setup for Tortoise TTS. | | Emerging |
| 1168 | Aivis-Project/AIVM-Generator: Aivis Voice Model File (.aivm/.aivmx) Generator / Editor | | Emerging |
| 1169 | Emotional-Text-to-Speech/hmm-for-emo-tts: 💻 A repository with comprehensive instructions for using the... | | Emerging |
| 1170 | pulijon/Sttcast: Transcription from mp3 files to html with or without embedded player | | Emerging |
| 1171 | rudrankriyam/Glosik: Sample project for F5-TTS using MLX Swift | | Emerging |
| 1172 | rtzr/Awesome-Korean-Speech-Recognition: List of Korean speech recognition (STT) APIs, with performance benchmarks for each. | | Emerging |
| 1173 | AkojimaSLP/Beamforming-for-speech-enhancement: Simple delay-and-sum, MVDR and CGMM-MVDR | | Emerging |
| 1174 | tuan3w/cnn_vocoder: A fast CNN-based vocoder | | Emerging |
| 1175 | 1neReality/MITSUHA: World's First Multilingual Inexpensive Therapeutic Sophisticated... | | Emerging |
| 1176 | revdotcom/reverb: Open source inference code for Rev's model | | Emerging |
| 1177 | solaoi/lycoris: Real-time speech recognition & AI-powered note-taking app for macOS with... | | Emerging |
| 1178 | JoelShine/Jarvis-v2.0: This is a major update of my project JARVIS-The-Ultimate-Project. You can... | | Emerging |
| 1179 | TheMorpheus407/OpenAI-Audiobook-Generator: This project is a web-based application that converts text into audio,... | | Emerging |
| 1180 | ardha27/AI-Waifu-Vtuber: AI Vtuber for Streaming on Youtube/Twitch | | Emerging |
| 1181 | pika-online/AESRC2020: A deep accent recognition network | | Emerging |
| 1182 | 1038lab/ComfyUI-SparkTTS: ComfyUI-SparkTTS is a custom ComfyUI node implementation of SparkTTS, an... | | Emerging |
| 1183 | Edw590/VISOR---A-Voice-Assistant: V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory! | | Emerging |
| 1184 | lucko515/speech-recognition-neural-network: This is the end-to-end Speech Recognition neural network, deployed in Keras.... | | Emerging |
| 1185 | hhguo/MSMC-TTS: Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS | | Emerging |
| 1186 | OpenMOSS/MOSS-Audio-Tokenizer: MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on... | | Emerging |
| 1187 | jianchang512/fireredasr-ui: A Chinese speech-to-text project, wrapping FireRedASR | | Emerging |
| 1188 | tihu-nlp/tihu: Persian Text-To-Speech | | Emerging |
| 1189 | FontaineRiant/wrAIter: AI writing assistant with voiced narrator and characters and an illustrator | | Emerging |
| 1190 | WangHelin1997/SSR-Speech: SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis | | Emerging |
| 1191 | cameronking4/VapiBlocks: Vapi Blocks is a library of components & api snips to copy and paste into... | | Emerging |
| 1192 | shenbengit/TTSTool: iFLYTEK offline speech, Text to Speech, TTS | | Emerging |
| 1193 | alan890104/sumi: Sumi — Free, open-source voice dictation for macOS. Local-first Whisper +... | | Emerging |
| 1194 | zeropointnine/tts-audiobook-tool: Audiobook creation tool with support for multiple TTS models (Qwen3-TTS,... | | Emerging |
| 1195 | kokimame/joytan: Creative Audio/Textbook Maker 🎵 📖 See our YouTube channel | | Emerging |
| 1196 | georgezhao2010/apple_airplayer: Make your AirPlay devices as TTS speakers | | Emerging |
| 1197 | pth2000/PowerPointReviewer: A PySide6-based tool for reviewing speech scripts by reading them aloud. It uses a TTS engine to read the notes in a PowerPoint file, helping you refine the content and wording of your talk and deliver a polished presentation. | | Emerging |
| 1198 | ssssssilver/sherpa-ncnn-unity: Real-time, accurate bilingual Chinese-English speech recognition in Unity, using the sherpa-ncnn framework. | | Emerging |
| 1199 | TETYYS/SAPI4: Web interface for Microsoft Sam & friends | | Emerging |
| 1200 | MainRo/docker-deepspeech-server: A dockerfile to run deepspeech-server | | Emerging |