All Voice AI Tools
6,981 tools ranked by quality score · Page 14 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 1301 | sayksii/Aria<br>ARIA - AI Realtime Intelligent Audio \| Universal real-time AI subtitles for Windows | | Emerging |
| 1302 | pschatzmann/arduino-espeak-ng<br>eSpeak NG is an open source speech synthesizer that supports more than... | | Emerging |
| 1303 | RapidAI/RapidTTS<br>A cross platform implementation of Text-to-Speech based on ONNXRuntime. | | Emerging |
| 1304 | yeyupiaoling/VITS-Pytorch<br>A PyTorch-based speech synthesis project using VITS, a speech synthesis method; this end-to-end model is very simple to use and needs no complicated steps such as text alignment,... | | Emerging |
| 1305 | seanghay/speechviewer<br>A quick audio dataset viewer | | Emerging |
| 1306 | 1ytic/pytorch-edit-distance<br>Levenshtein edit-distance on PyTorch and CUDA | | Emerging |
| 1307 | soupslurpr/Transcribro<br>Private and on-device speech recognition keyboard and service for Android. | | Emerging |
| 1308 | AndroidMaryTTS/AndroidMaryTTS<br>Android MARY TTS - an open-source, offline HMM-Based text-to-speech... | | Emerging |
| 1309 | RafalWilinski/serverless-medium-text-to-speech<br>🔊 Serverless-based, text-to-speech service for Medium articles | | Emerging |
| 1310 | DmitryRyumin/INTERSPEECH-2023-24-Papers<br>INTERSPEECH 2023-2024 Papers: A complete collection of influential and... | | Emerging |
| 1311 | mmpneo/curses<br>Speech to Text and KB input captions for OBS, VRChat, Twitch chat and Discord | | Emerging |
| 1312 | saidsef/tika-document-to-text<br>Apache Tika extract text and metadata from any document format with this... | | Emerging |
| 1313 | Detoxfox4234/Qwen3-Voice-Factory<br>Local, portable GUI for Qwen3-TTS. Optimized for NVIDIA RTX 50 Series (CUDA... | | Emerging |
| 1314 | gheyret/UQSpeechDataset<br>Uyghur Single Speaker Speech Dataset. | | Emerging |
| 1315 | TeaPoly/Conformer-Athena<br>Dynamic Chunk Streaming and Offline Conformer based on athena-team/Athena. | | Emerging |
| 1316 | jscrane/TTS<br>Arduino Text-to-Speech Library | | Emerging |
| 1317 | p1an-lin-jung/teochew-g2p<br>A text-side processing tool and orthography standard for Teochew, mainly serving speech synthesis for the Teochew dialect | | Emerging |
| 1318 | foamliu/Listen-Attend-Spell-v2<br>PyTorch implementation of Listen Attend and Spell Automatic Speech Recognition (ASR). | | Emerging |
| 1319 | pymike00/YouTube-Tutorials<br>:open_file_folder: Source Code for (some of) the Programming Tutorials from... | | Emerging |
| 1320 | danielclough/vibevoice-rs<br>Rust implementation of VibeVoice text-to-speech with voice cloning and... | | Emerging |
| 1321 | lancejames221b/jarvis-voice<br>OpenJarvis: Real-time AI voice assistant for Discord. Talk to the same... | | Emerging |
| 1322 | Kajitsy/Emilia<br>Emilia - Desktop Character.AI Client | | Emerging |
| 1323 | RoySheffer/im2wav<br>Official implementation of the pipeline presented in I hear your true... | | Emerging |
| 1324 | izwi-ai/izwi<br>On-device AI engine for transcription, TTS, and voice workflows. | | Emerging |
| 1325 | zw76859420/ASR_Syllable<br>Research on acoustic models for speech recognition based on convolutional neural networks | | Emerging |
| 1326 | Baidu-AIP/speech-tts-cors<br>Baidu Speech: cross-origin (CORS) speech synthesis demo and support library | | Emerging |
| 1327 | hyeonsangjeon/computing-Korean-STT-error-rates<br>Python function package for computing the character error rate (CER) and word error rate (WER) of Korean STT recognizer output transcripts | | Emerging |
| 1328 | speechio/BigCiDian<br>Pronunciation lexicon covering both English and Chinese languages for... | | Emerging |
| 1329 | speechly/speechly<br>Client libraries, examples and demos of Speechly API for the Web. | | Emerging |
| 1330 | SadeghKrmi/pertts-streamlit<br>Persian text-to-speech streamlit interface | | Emerging |
| 1331 | adi-gov-tw/Taiwan-Tongues-ASR-CE<br>Taiwan Tongues ASR CE is an open-source speech recognition (Automatic Speech Recognition,... | | Emerging |
| 1332 | Warma10032/easytts<br>Building the simplest collection of TTS front ends and the simplest audiobook production workflow. Splits novels into sentences with regex rules and identifies the speaker of each line of dialogue with RoBERTa, enabling one-click generation of multi-speaker... | | Emerging |
| 1333 | dusty-nv/jetson-voice<br>ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch... | | Emerging |
| 1334 | Pictalk-speech-made-easy/pictalk-frontend<br>Pictalk is an open-source application designed to assist individuals with... | | Emerging |
| 1335 | reybahl/Assistant<br>A machine learning powered, voice-based virtual assistant for Raspberry Pi.... | | Emerging |
| 1336 | hcy71o/SNAC<br>Unofficial PyTorch implementation of SNAC: Speaker-normalized affine... | | Emerging |
| 1337 | adelacvg/ttts<br>Train the next generation of TTS systems. | | Emerging |
| 1338 | agent87/RW-DEEPSPEECH-API<br>An end to end deep speech REST API containing speech to text and text speech... | | Emerging |
| 1339 | double22a/asr_nlp_paper_code<br>Papers of ASR, Tools of ASR | | Emerging |
| 1340 | abus-aikorea/kara-audio<br>Gradio WebUI for whisper, faster-whisper, whisper-timestamped. Supports... | | Emerging |
| 1341 | unilight/jatts<br>JATTS: A modern, research-oriented Japanese Text-to-speech Open-sourced Toolkit | | Emerging |
| 1342 | ShadowForests/VoiceToSpeech<br>Live speech recognition to synthesized speech with hundreds of voices, TTS,... | | Emerging |
| 1343 | stts-se/wikispeech-server<br>The main API for Wikispeech | | Emerging |
| 1344 | HurroWorld/text-to-audio2face<br>Web interface to convert text to speech and route it to an Audio2Face... | | Emerging |
| 1345 | gillesdemey/google-speech-v2<br>:speech_balloon: Reverse Engineering Google's Speech To Text API (v2) | | Emerging |
| 1346 | maum-ai/nuwave2<br>NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling... | | Emerging |
| 1347 | itspyguru/Tkinter-Applications<br>A collection of small tkinter apps made by me | | Emerging |
| 1348 | elimu-ai/vitabu<br>📚 Android application for reading storybooks and expanding word vocabulary. | | Emerging |
| 1349 | nerdaxic/glados-voice-assistant<br>DIY Voice Assistant based on the GLaDOS character from Portal video game... | | Emerging |
| 1350 | n0th1ng-else/voice-to-text-bot<br>Telegram bot that converts Voice messages into text | | Emerging |
| 1351 | mapluisch/OpenAI-Text-To-Speech-for-Unity<br>Implementation of OpenAI's Text-To-Speech in Unity. Synthesize any text and... | | Emerging |
| 1352 | tarun7r/SpeechAlgo<br>A Comprehensive Speech Processing Algorithms Library for research and production use | | Emerging |
| 1353 | belambert/asr-tools<br>Libraries and scripts for manipulating and handling ASR output/n-bests/etc. | | Emerging |
| 1354 | racai-ai/RobinASR<br>Romanian Automatic Speech Recognition from the ROBIN project | | Emerging |
| 1355 | rishikksh20/Avocodo-pytorch<br>Avocodo: Generative Adversarial Network for Artifact-free Vocoder | | Emerging |
| 1356 | myuan19/voiceInput<br>Windows AI voice input 🎙: press a hotkey and speak to type, with optional polishing. Break free of typing limits for unconstrained, efficient expression. | | Emerging |
| 1357 | codyw912/open-asr-server<br>OpenAI-compatible ASR server with pluggable local backends (Parakeet,... | | Emerging |
| 1358 | daanzu/speech-training-recorder<br>Simple GUI application to help record audio dictated from given text... | | Emerging |
| 1359 | deepgram-starters/django-voice-agent<br>Get started using Deepgram's Voice Agent with this Django demo app | | Emerging |
| 1360 | FR33TR1ST/VoiceAssistant<br>A voice assistant with WhisperAI speech recognition | | Emerging |
| 1361 | Labmem-Zhouyx/CDFSE_FastSpeech2<br>The Official Implementation of “Content-Dependent Fine-Grained Speaker... | | Emerging |
| 1362 | sophiefy/StellaVoiceChanger<br>Deep-learning-based voice changer, supporting local inference. | | Emerging |
| 1363 | mrtozner/vox<br>Local voice AI framework for Rust. Whisper + LLM + TTS with no cloud dependencies. | | Emerging |
| 1364 | JollyToday/GhostCut-auto_video_translation<br>auto video translation-video translator can auto translate video hard... | | Emerging |
| 1365 | cvqluu/TDNN<br>Time delay neural network (TDNN) implementation in PyTorch using unfold method | | Emerging |
| 1366 | TranscribeJs/transcribe.js<br>Monorepo for Transcribe.js | | Emerging |
| 1367 | mrf345/django_gtts<br>Django app extension to add gTTS google text-to-speech | | Emerging |
| 1368 | westonruter/spoken-word<br>Spoken Word | | Emerging |
| 1369 | felipefacundes/brasiltts<br>Brasil TTS is a set of speech synthesizers, in Brazilian Portuguese,... | | Emerging |
| 1370 | Yazdi9/TTS-MultiLingual<br>Text To Speech Multilingual Support (20+ languages) | | Emerging |
| 1371 | john-carroll-sw/coffee-chat-voice-assistant<br>Coffee Chat Voice Assistant is a voice-driven ordering system powered by... | | Emerging |
| 1372 | WismutHansen/READ2ME<br>Turn text from websites into spoken audio with edge-tts, F5, etc. and save... | | Emerging |
| 1373 | jcrodriguez1989/heyshiny<br>Package: New `shiny` input that translates audio to text | | Emerging |
| 1374 | xifan2333/fcitx5-vinput<br>Local offline voice input plugin for Fcitx5 | | Emerging |
| 1375 | 34j/neural-source-filter<br>Python package for NSF and NSF-HiFi-GAN (unofficial) | | Emerging |
| 1376 | royshil/cloudvocal<br>Cloud AI live transcription and translation service plugin | | Emerging |
| 1377 | syhw/wer_are_we<br>Attempt at tracking states of the arts and recent results (bibliography) on... | | Emerging |
| 1378 | thewh1teagle/phonikud-tts<br>phonikud-tts - text to speech in Hebrew | | Emerging |
| 1379 | deepgram-starters/node-text-to-speech<br>Get started using Deepgram's Text-to-Speech with this Node demo app | | Emerging |
| 1380 | AIFSH/ComfyUI-GPT_SoVITS<br>A ComfyUI custom node for GPT-SoVITS! You can now do voice cloning and TTS in ComfyUI | | Emerging |
| 1381 | zhangzijie-pro/Speaker-Verification<br>Dual-model speech AI toolkit for speaker verification and speaker-aware... | | Emerging |
| 1382 | google-research-datasets/TextNormalizationCoveringGrammars<br>Covering grammars for English and Russian text normalization | | Emerging |
| 1383 | voicegain/python-sdk<br>Python SDK for working with Voicegain Speech-to-Text | | Emerging |
| 1384 | Speaker-Identification/You-Only-Speak-Once<br>Deep Learning - one shot learning for speaker recognition using Filter Banks | | Emerging |
| 1385 | stefantaubert/en-tts<br>Command-line interface and Python library for synthesizing English texts into speech. | | Emerging |
| 1386 | AudioLLMs/AudioBench<br>AudioBench: A Universal Benchmark for Audio Large Language Models | | Emerging |
| 1387 | saurabhshri/CCAligner<br>🔮 Word by word audio subtitle synchronisation tool and API. Developed under... | | Emerging |
| 1388 | Executedone/Chinese-FastSpeech2<br>Continues training on the Biaobei (Databaker) data and improves the original FastSpeech2 model by adding prosody representation and prosody prediction modules, making Chinese pronunciation more vivid and rhythmic | | Emerging |
| 1389 | TrevorS/qwen3-tts-rs<br>Rust implementation of Qwen3-TTS speech synthesis | | Emerging |
| 1390 | megaease/easevoice-trainer<br>EaseVoice Trainer is a simple and user-friendly voice cloning and speech... | | Emerging |
| 1391 | kaieberl/paper2speech<br>Convert any English paper or scientific book to audio | | Emerging |
| 1392 | SungFeng-Huang/Meta-TTS<br>Official repository of https://doi.org/10.1109/TASLP.2022.3167258. More... | | Emerging |
| 1393 | tugstugi/pytorch-speech-commands<br>Speech commands recognition with PyTorch \| Kaggle 10th place solution in... | | Emerging |
| 1394 | pkozul/ha-tts-bluetooth-speaker<br>TTS Bluetooth Speaker for Home Assistant | | Emerging |
| 1395 | Jaffe2718/Microphone-Text-Input<br>A fabric mod that can recognize speech as text messages and automatically... | | Emerging |
| 1396 | JasonLovesDoggo/Flow<br>Native macOS dictation that captures audio, transcribes speech, and formats... | | Emerging |
| 1397 | ORI-Muchim/Efficient-Speech<br>Lightweight Korean TTS Model based on FastSpeech2 | | Emerging |
| 1398 | dokterbob/macos-speech-server<br>Local, fast and efficient Speech to Text (STT) and Text to Speech (TTS) on... | | Emerging |
| 1399 | awexandrr/audioWhisper<br>Listen to any audio stream on your machine and print out the transcribed or... | | Emerging |
| 1400 | asticode/go-astibob<br>Golang framework to build an AI that can understand and speak back to you,... | | Emerging |