All Voice AI Tools
6,981 tools ranked by quality score · Page 17 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 1601 |
DarioFT/ComfyUI-Qwen3-TTS
A ComfyUI custom node suite for Qwen3-TTS, supporting 1.7B and 0.6B models,... |
|
Emerging |
| 1602 |
rerender2021/echo
A simple ASR translator powered by Avernakis React. |
|
Emerging |
| 1603 |
Wendison/FCL-taco2
Official implementation of FCL-taco2: Fast, Controllable and Lightweight... |
|
Emerging |
| 1604 |
ScottishFold007/Cosyvoice_DPO_NOTES
CosyVoice_DPO_NOTES: Supercharge Your Cosyvoice model with Cutting-Edge DPO... |
|
Emerging |
| 1605 |
chenwr727/Stock-Insight-AI
Stock-Insight-AI: one-click generation of stock and futures analysis videos |
|
Emerging |
| 1606 |
SamYuan1990/flet_sherpa_onnx
flet_sherpa_onnx: an ASR/STT library for Flet based on sherpa-onnx |
|
Emerging |
| 1607 |
byhow/yanyu
A Text-to-Speech node package with pinyin audio library. |
|
Emerging |
| 1608 |
DrAchernar/location-based-AR-app
This Flutter project is an example for a location based AR app with... |
|
Emerging |
| 1609 |
sq2ips/sr0wx
Modernized version of the sr0wx automatic amateur-radio weather station project |
|
Emerging |
| 1610 |
LlmKira/fast-langdetect
⚡️ 80x faster Fasttext language detection out of the box | Split text by language |
|
Emerging |
| 1611 |
rcspam/dictee
Push-to-talk voice dictation for Linux — 100% local, multilingual (25+... |
|
Emerging |
| 1612 |
BobRandomNumber/ComfyUI-DiaTTS
ComfyUI Dia safetensors implementation |
|
Emerging |
| 1613 |
Kalebu/image-to-sound-python-
A python project for converting an Image into audible sound using OCR and... |
|
Emerging |
| 1614 |
Kalebu/Python-Speech-Recognition-
This consists of basic examples of performing speech recognition in Python... |
|
Emerging |
| 1615 |
Gmzxdotzz/Dia-TTS-Server
Self-host the powerful Dia TTS model. This server offers a user-friendly Web... |
|
Emerging |
| 1616 |
dpm76/QuickRouteMap
Simple route guidance application. |
|
Emerging |
| 1617 |
KevKibe/African-Whisper
🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual... |
|
Emerging |
| 1618 |
DeutscheKI/tevr-asr-tool
State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines... |
|
Emerging |
| 1619 |
ninjahuttjr/hal-answering-service
I'm sorry, Dave. I'm afraid I can't let that spam call through. — Local AI... |
|
Emerging |
| 1620 |
nestyme/Subtitles-generator
Generates a transcript for a video from a link |
|
Emerging |
| 1621 |
elbruno/ElBruno.Realtime
Pluggable real-time audio conversation framework for .NET. Local VAD, STT,... |
|
Emerging |
| 1622 |
bdim404/Qwen3-TTS-WebUI
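A full-stack text-to-speech web application based on Alibaba's Qwen3-TTS model (1.7 billion parameters), supporting custom voices, voice design, voice cloning, and audiobook generation. A... |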
|
Emerging |
| 1623 |
timoil/whisper-subtitles
🎬 AI-powered localhost subtitle generator for hearing-impaired users.... |
|
Emerging |
| 1624 |
mozi1924/Qwen3-TTS-EasyFinetuning
Easy fine-tuning for Qwen3-TTS: Fast voice cloning and high-quality... |
|
Emerging |
| 1625 |
sai9640nayak/StreamingKokoroJS
Unlimited text-to-speech in the Browser using Kokoro-JS, 100% local, 100%... |
|
Emerging |
| 1626 |
Amanbig/ChatMe
ChatMe combines agent-driven AI, cross-platform responsiveness, and voice... |
|
Emerging |
| 1627 |
Sri-Krishna-V/Elu
AI-powered Chrome extension that makes any web article accessible —... |
|
Emerging |
| 1628 |
scripty-bot/scripty
Speech to text bot for Discord |
|
Emerging |
| 1629 |
mxvsh/wave
Native macOS dictation app focused on fast voice-to-text workflows. |
|
Emerging |
| 1630 |
drivendataorg/childrens-speech-recognition-benchmark-pub
Tutorial code for the On Top of Pasketti: Children’s Speech Recognition Challenge |
|
Emerging |
| 1631 |
HachiroSan/google-pronouncer
🔊 Download pronunciation audio files from Google's dictionary service.... |
|
Emerging |
| 1632 |
saurabhdaware/bol
Slightly more consistent Text-to-speech for Web and a wrapper around speechSynthesis |
|
Emerging |
| 1633 |
gittyeric/FAlexa
Create your own verbal commands that fuzzily map to custom Javascript /... |
|
Emerging |
| 1634 |
PhuocElec/zipformer-asr-api
REST-API implementation of ZipFormer for automatic speech recognition (ASR)... |
|
Emerging |
| 1635 |
mahimairaja/openrtc-python
OpenRTC lets developers run multiple LiveKit voice agents in one Python... |
|
Emerging |
| 1636 |
wildminder/ComfyUI-KaniTTS
ComfyUI node for modular, human‑like Kani TTS. Generate natural,... |
|
Emerging |
| 1637 |
ALERTua/styletts2-ukrainian-openai-tts-api
OpenAI TTS Compatible Ukrainian TTS StyleTTS2 Pipeline |
|
Emerging |
| 1638 |
kaloprojects/KALO-ESP32-Voice-Assistant
Code snippets showing how to record I2S audio and store as .wav file on... |
|
Emerging |
| 1639 |
Sundy1219/ctc_beam_search_lm
CTC+Beam_Search+kenlm is a decoding system for acoustic models that use Chinese characters as the modeling unit |
|
Emerging |
| 1640 |
ayutaz/uPiper
Unity TTS plugin: Piper neural synthesis + pure C# G2P (Japanese/English) +... |
|
Emerging |
| 1641 |
mgonzs13/piper_ros
piper Text-to-Speech for ROS 2 |
|
Emerging |
| 1642 |
MahtaFetrat/ManaTTS-Persian-Speech-Dataset
ManaTTS is the largest open Persian speech dataset with 114+ hours of... |
|
Emerging |
| 1643 |
shanghaimoon888/mod_vadasr
This is a FreeSWITCH module that performs VAD and ASR with the iFLYTEK WebSocket API. |
|
Emerging |
| 1644 |
sooftware/lightning-asr
Modular and extensible speech recognition library leveraging... |
|
Emerging |
| 1645 |
vectominist/MiniASR
A mini, simple, and fast end-to-end automatic speech recognition toolkit. |
|
Emerging |
| 1646 |
Voice-Privacy-Challenge/Voice-Privacy-Challenge-2020
Baseline Recipe for VoicePrivacy Challenge 2020:... |
|
Emerging |
| 1647 |
jcsilva/docker-kaldi-android
Dockerfile for compiling Kaldi for Android. |
|
Emerging |
| 1648 |
ArchitParnami/Few-Shot-KWS
Few-Shot Keyword Spotting |
|
Emerging |
| 1649 |
yh1008/speech-to-text
mixlingual speech recognition system; hybrid (GMM+NNet) model; Kaldi + Keras |
|
Emerging |
| 1650 |
Franck-Dernoncourt/ASR_benchmark
Program to benchmark various speech recognition APIs |
|
Emerging |
| 1651 |
wulee510505/Text2Speach
Speech synthesis in a single line of code: text to speech |
|
Emerging |
| 1652 |
1038lab/ComfyUI-VoxCPMTTS
A clean, efficient ComfyUI custom node for VoxCPM TTS (Text-to-Speech)... |
|
Emerging |
| 1653 |
vigonotion/tts.astromech
Text to Astromech integration for Home Assistant (R2D2 Beep Boop Sounds) |
|
Emerging |
| 1654 |
nixonyh/UnityASR
Automatic Speech Recognition in Unity. |
|
Emerging |
| 1655 |
ga642381/FastSpeech2
Multi-Speaker Pytorch FastSpeech2: Fast and High-Quality End-to-End Text to... |
|
Emerging |
| 1656 |
usabarashi/voicevox-cli
Japanese text-to-speech using VOICEVOX Core |
|
Emerging |
| 1657 |
LearnedVector/Wav2Letter
Speech recognition model based on a FAIR research paper, built using PyTorch. |
|
Emerging |
| 1658 |
b4rtaz/voice-assistant
Voice assistant for Visual Studio Code. |
|
Emerging |
| 1659 |
cdimascio/watson-html5-speech-recognition
Speech Recognition for Browsers via Webkit, HTML5, and Watson |
|
Emerging |
| 1660 |
echonoshy/tingshu
Tingshu 听舒 | Bringing the author’s voice directly to you |
|
Emerging |
| 1661 |
stefantaubert/pronunciation-dictionary-utils
Utils to modify pronunciation dictionaries. |
|
Emerging |
| 1662 |
smartgic/docker-mycroft
Mycroft AI Voice Assistant Docker images and docker-compose.yml files for... |
|
Emerging |
| 1663 |
wblgers/hmm_speech_recognition_demo
A demo of simple isolated-word Chinese speech recognition using GMM-HMM in Python |
|
Emerging |
| 1664 |
bgArray/ZhiYin
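ZhiYin (知音): integrated software for AI audio and listening features. Provides vocal technique recognition and analysis, accompaniment separation, and other tools. |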
|
Emerging |
| 1665 |
lissettecarlr/kuon
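Kuon (久远): an LLM voice assistant under development, currently focused on ease of use and a simple start, with support for selective conversation memory and Model Context Protocol (MCP) services.... |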
|
Emerging |
| 1666 |
Robofied/Voicenet
Comprehensive Python library for speech and voice. |
|
Emerging |
| 1667 |
Hagsten/Talkify
Javascript Text to speech library |
|
Emerging |
| 1668 |
amd/LIRA
This tool helps you easily deploy ASR models on NPUs on AMD's Ryzen AI 300... |
|
Emerging |
| 1669 |
aviaryan/Very-Fast-Dictation
Instant dictation app for Mac |
|
Emerging |
| 1670 |
jsugg/ser
The AI-powered ser Python package is a tool for recognizing and analyzing... |
|
Emerging |
| 1671 |
rorpage/openfaas-text-to-speech
Generate an MP3 of text using Google's Text-to-Speech |
|
Emerging |
| 1672 |
CoffeeMethod/KokoroGUI
An advanced TTS software, built for audiobooks, podcasts, videos, and more. |
|
Emerging |
| 1673 |
soundhound/hound-sdk-web-example
An example of how to work with text and voice requests using the Houndify... |
|
Emerging |
| 1674 |
silenterus/deepspeech-cleaner
Multi-Language Dataset Cleaner/Creator for Mozilla's DeepSpeech Framework |
|
Emerging |
| 1675 |
Picovoice/speech-to-intent-benchmark
Benchmark for speech-to-intent engines |
|
Emerging |
| 1676 |
pingfury108/book2tts
Audiobook production tool |
|
Emerging |
| 1677 |
t0mer/tts-stt
Small Python Flask container for converting text to speech and speech to text |
|
Emerging |
| 1678 |
ahaocd/davinci-voice-clone
DaVinci Subtitle Alignment + Voice Clone + AI Emotion Optimization | CosyVoice2 TTS |
|
Emerging |
| 1679 |
niteshsharmacodes/neutts-ultimate
NeuTTS-Ultimate - Advanced Text-to-Speech generation with unlimited... |
|
Emerging |
| 1680 |
DKMitt/speech-to-text-js
The Voice Note App's purpose is to experiment with the Web Speech API by... |
|
Emerging |
| 1681 |
goodmike31/pl-asr-bigos-tools
Extendable toolkit for comprehensive evaluation of ASR systems. Currently... |
|
Emerging |
| 1682 |
gaborvecsei/whisper-live-transcription
Live-Transcription (STT) with Whisper PoC |
|
Emerging |
| 1683 |
resemble-ai/resemble-unity-text-to-speech
Resemble's voice cloning engine within Unity |
|
Emerging |
| 1684 |
cottongeeks/podscript
Generate podcast transcripts using language and speech-to-text models |
|
Emerging |
| 1685 |
atosystem/SpeechCLIP
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model,... |
|
Emerging |
| 1686 |
titilambert/pynuance
Wrapper for Nuance Communications services |
|
Emerging |
| 1687 |
Degon3399/XTTS_V2
This repository offers a framework for fine-tuning the XTTS_V2 model,... |
|
Emerging |
| 1688 |
DarkPancakes/clipforge
AI-powered short-form video generator. Create viral YouTube Shorts & TikTok... |
|
Emerging |
| 1689 |
grebtsew/Text_To_Speech_Server_Node
A super simple speaking server node that receives requests and reads them... |
|
Emerging |
| 1690 |
vkosuri/dialogflow-lite
[Maintainer Required] A light-weight python library REST agent for Dialogflow |
|
Emerging |
| 1691 |
IBM/watson-streaming-stt
Example of using Watson's Streaming Speech to Text websockets interface for... |
|
Emerging |
| 1692 |
takahi-ro/ConvivialChat
This system provides the web space where text and speech coexist, and you... |
|
Emerging |
| 1693 |
mikopbx/ModuleSmartIVR
Smart routing module for 1C:Enterprise |
|
Emerging |
| 1694 |
deepgram-starters/go-voice-agent
Get started using Deepgram's Voice Agent with this Go demo app |
|
Emerging |
| 1695 |
Skeli010/GaryTTS
Powerful, free, local text-to-speech software |
|
Emerging |
| 1696 |
beyondwords-io/wordpress-plugin
BeyondWords is the AI voice platform that brings frictionless audio... |
|
Emerging |
| 1697 |
hopkira/k9
Latest main K9 robot repository with 3D vision, local STT/TTS with GPT-3 and... |
|
Emerging |
| 1698 |
seanghay/KLEA
An open-source Khmer Word to Speech Model. Just single word not sentence! |
|
Emerging |
| 1699 |
alam025/ai-voice-assistant-appointment-booking
Enterprise-grade AI voice assistant for automated appointment scheduling... |
|
Emerging |
| 1700 |
richardassar/SampleRNN_torch
Torch implementation of SampleRNN: An Unconditional End-to-End Neural Audio... |
|
Emerging |