All Voice AI Tools
6,981 tools ranked by quality score · Page 8 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 701 | YoavRamon/awesome-kaldi<br>This is a list of features, scripts, blogs and resources for better using... | | Emerging |
| 702 | modelscope/ClearerVoice-Studio<br>An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained... | | Emerging |
| 703 | Jackiexiao/zhtts<br>A demo of a Chinese (zh) text-to-speech system that runs on CPU in real time. | | Emerging |
| 704 | Purple-Horizons/openclaw-voice<br>🦞 Open-source browser-based voice chat for AI assistants. Self-hosted,... | | Emerging |
| 705 | wildminder/ComfyUI-VoxCPM<br>ComfyUI node for highly expressive speech and realistic zero-shot voice cloning | | Emerging |
| 706 | AppDevGuy/OSSSpeechKit<br>OSSSpeechKit offers a native iOS Speech wrapper for AVFoundation and Apple's Speech. | | Emerging |
| 707 | mdangschat/ctc-asr<br>End-to-end trained speech recognition system, based on RNNs and the... | | Emerging |
| 708 | keonlee9420/Parallel-Tacotron2<br>PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive... | | Emerging |
| 709 | modelscope/KAN-TTS<br>KAN-TTS is a speech-synthesis training framework, please try the demos we... | | Emerging |
| 710 | GeekyWizKid/video_processing_service<br>Video Processing Service is an automated video processing service that... | | Emerging |
| 711 | soobinseo/Tacotron-pytorch<br>PyTorch implementation of Tacotron | | Emerging |
| 712 | PhamHuynhAnh16/Vietnamese-RVC<br>A voice conversion tool project for Vietnamese users | | Emerging |
| 713 | rishikksh20/AdaSpeech<br>AdaSpeech: Adaptive Text to Speech for Custom Voice | | Emerging |
| 714 | madhavmk/Noise2Noise-audio_denoising_without_clean_training_data<br>Source code for the paper titled "Speech Denoising without Clean Training... | | Emerging |
| 715 | Umesh-01/Python-Assistant<br>Python Assistant (PA) is a voice command based assistant service written in... | | Emerging |
| 716 | Deepest-Project/MelNet<br>Implementation of "MelNet: A Generative Model for Audio in the Frequency Domain" | | Emerging |
| 717 | open-mmlab/Amphion<br>Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation.... | | Emerging |
| 718 | xue-fei/sherpa-onnx-unity<br>sherpa-onnx in Unity | | Emerging |
| 719 | chenliangrui/EasyMrcp<br>Welcome to EasyMrcp! EasyMrcp is written in Java and currently provides integrations with a variety of ASR and TTS services, making ASR and TTS truly simple to use.... | | Emerging |
| 720 | travisvn/obsidian-edge-tts<br>Free, high quality text-to-speech for your Obsidian notes, leveraging... | | Emerging |
| 721 | alexruperez/SpeechRecognizerButton<br>UIButton subclass with push to talk recording, speech recognition and... | | Emerging |
| 722 | themanyone/whisper_dictation<br>Private voice keyboard, AI chat, images, webcam, recordings, voice control... | | Emerging |
| 723 | nari-labs/dia2<br>TTS model capable of streaming conversational audio in realtime. | | Emerging |
| 724 | rishikksh20/VocGAN<br>VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested... | | Emerging |
| 725 | maxwellobi/Android-Speech-Recognition<br>Continuous speech recognition library for Android with options to use... | | Emerging |
| 726 | slp-rl/aero<br>This repo contains the official PyTorch implementation of "Audio Super... | | Emerging |
| 727 | Plachtaa/VALL-E-X<br>An open source implementation of Microsoft's VALL-E X zero-shot TTS model.... | | Emerging |
| 728 | leaonline/easy-speech<br>🔊 Cross browser Speech Synthesis also known as Text to speech or TTS; no... | | Emerging |
| 729 | freewym/espresso<br>Espresso: A Fast End-to-End Neural Speech Recognition Toolkit | | Emerging |
| 730 | atomiechen/FunASR-Client<br>Really easy-to-use Python client for FunASR runtime server. | | Emerging |
| 731 | pluja/whishper<br>Transcribe any audio to text, translate and edit subtitles 100% locally with... | | Emerging |
| 732 | jianchang512/clone-voice<br>A sound cloning tool with a web interface, using your voice or any sound to... | | Emerging |
| 733 | dhruvyad/uttertype<br>Short code for dictation using OpenAI Whisper for transcription. | | Emerging |
| 734 | chenkui164/FastASR<br>A project that implements ASR inference in C++. It has few dependencies, is simple to install, runs inference fast, and runs smoothly even on ARM platforms such as the Raspberry Pi 4B.... | | Emerging |
| 735 | shashikg/WhisperS2T<br>An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting... | | Emerging |
| 736 | jpreprocess/jbonsai<br>Voice synthesis library for Text-to-Speech applications (Currently HTS... | | Emerging |
| 737 | acoti/articulate.js<br>A jQuery plugin that lets the browser speak to you. | | Emerging |
| 738 | ictnlp/StreamSpeech<br>StreamSpeech is an “All in One” seamless model for offline and simultaneous... | | Emerging |
| 739 | Evil0ctal/Fast-Powerful-Whisper-AI-Services-API<br>⚡ A high-performance asynchronous API for automatic speech recognition (ASR) and translation. No need to purchase Whisper... | | Emerging |
| 740 | duncan3dc/speaker<br>A PHP library to convert text to speech using various web services | | Emerging |
| 741 | Kardbord/hfapigo<br>Unofficial (Golang) Go bindings for the Hugging Face Inference API | | Emerging |
| 742 | neosapience/mlp-singer<br>Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing... | | Emerging |
| 743 | joelpurra/talkie<br>Text-to-speech browser extension button. Select text on any web page, and... | | Emerging |
| 744 | Quantatirsk/funasr-api<br>Speech recognition API service powered by FunASR and Qwen-ASR, supporting 52... | | Emerging |
| 745 | yanorei32/discord-tts<br>TTS Discord Bot [VOICEROID, VOICEVOX, AivisSpeech, kttsproject, WinRT, and... | | Emerging |
| 746 | SARIT42/lipsyncr<br>LipSyncr is a lip reading web app based on the LipNet model that can lip... | | Emerging |
| 747 | arihanv/Shush<br>Shush is an app that deploys a WhisperV3 model with Flash Attention v2 on... | | Emerging |
| 748 | mark-rez/TikTok-Voice-TTS<br>Simple Python script to interact with the TikTok TTS Voices. | | Emerging |
| 749 | eel-brah/kokorodoki<br>Natural-sounding Text-to-Speech App that fits anywhere. Fast, Real-Time and flexible. | | Emerging |
| 750 | AmphionTeam/FlexiCodec<br>[ICLR2026] FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates | | Emerging |
| 751 | deepgram-starters/flask-transcription<br>Get started using Deepgram's Pre-Recorded Transcription with this Flask demo app | | Emerging |
| 752 | DePasqualeOrg/mlx-swift-audio<br>Swift tools for text to speech (TTS) and speech to text (STT) powered by MLX | | Emerging |
| 753 | kkoutini/PaSST<br>Efficient Training of Audio Transformers with Patchout | | Emerging |
| 754 | eigenpunk/ComfyUI-audio<br>some generative audio tools for ComfyUI | | Emerging |
| 755 | npuichigo/waveglow<br>A PyTorch implementation of the WaveGlow: A Flow-based Generative Network... | | Emerging |
| 756 | d4n3436/Fergun<br>A utility Discord bot written in C# using Discord.Net | | Emerging |
| 757 | FireRedTeam/FireRedASR2S<br>A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc... | | Emerging |
| 758 | pinguy/kokoro-tts-addon<br>Local neural TTS for Browsers: fast, expressive, and offline—runs on modest hardware. | | Emerging |
| 759 | VideotronicMaker/LM-Studio-Voice-Conversation<br>Python app for LM Studio-enhanced voice conversations with local LLMs. Uses... | | Emerging |
| 760 | eduardolat/kokoro-web<br>🔊 Kokoro Web: Free AI text-to-speech, online or self-hosted, OpenAI compatible! | | Emerging |
| 761 | ai-adv-lab/deepspeech.mxnet<br>An MXNet implementation of Baidu's DeepSpeech architecture | | Emerging |
| 762 | deterministic-algorithms-lab/Cross-Lingual-Voice-Cloning<br>Tacotron 2 - PyTorch implementation with faster-than-realtime inference... | | Emerging |
| 763 | pritishyuvraj/Voice-Conversion-GAN<br>Voice Conversion using CycleGANs for Non-Parallel Data | | Emerging |
| 764 | sipeed/Maix-Speech<br>Maix Speech AI lib, a fast and small speech lib running on embedded devices,... | | Emerging |
| 765 | halfzm/v2vt<br>video to video translation with voice clone and lip... | | Emerging |
| 766 | phatjkk/SpeakIt_Vietnamese_TTS<br>Vietnamese Text-to-Speech on Windows Project (zalo-speech) | | Emerging |
| 767 | cvqluu/simple_diarizer<br>Simplified diarization pipeline using some pretrained models - audio file to... | | Emerging |
| 768 | cpfair/quran-align<br>Word-accurate timestamps for Qur'anic audio. | | Emerging |
| 769 | smeetrs/deep_avsr<br>A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper. | | Emerging |
| 770 | Open-Speech-EkStep/vakyansh-models<br>Open source speech to text models for Indic Languages | | Emerging |
| 771 | d4n3436/GTranslate<br>A collection of free translation APIs (Google Translate, Bing Translator,... | | Emerging |
| 772 | ayutaz/piper-plus<br>Multilingual neural TTS (6 languages: JA/EN/ZH/ES/FR/PT) with VITS... | | Emerging |
| 773 | gtreshchev/RuntimeSpeechRecognizer<br>Cross-platform, real-time, offline speech recognition plugin for Unreal... | | Emerging |
| 774 | HumeAI/hume-react-sdk<br>Packages for using Hume AI and React | | Emerging |
| 775 | JoelShine/JARVIS-AI-ASSISTANT<br>A true Artificial Intelligent Assistant with ALICE as backend and offline... | | Emerging |
| 776 | goxr3plus/java-google-speech-api<br>🙊 Speech Recognition, Text To Speech, Google Translate | | Emerging |
| 777 | tarun7r/Vocal-Agent<br>Cascading voice assistant combining real-time speech recognition, AI... | | Emerging |
| 778 | HardCodeDev777/UnityNeuroSpeech<br>The world’s first game framework that lets you talk to AI in real time —... | | Emerging |
| 779 | keonlee9420/Expressive-FastSpeech2<br>PyTorch Implementation of Non-autoregressive Expressive (emotional,... | | Emerging |
| 780 | joethei/obsidian-tts<br>Text to speech for Obsidian. Hear your notes. | | Emerging |
| 781 | patrickenfuego/Chapterize-Audiobooks<br>Split a single, monolithic mp3 audiobook file into chapters using Machine... | | Emerging |
| 782 | ide8/tacotron2<br>Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow | | Emerging |
| 783 | atomicoo/FCH-TTS<br>A fast Text-to-Speech (TTS) model. Works well for English, Mandarin/Chinese,... | | Emerging |
| 784 | Open-Speech-EkStep/vakyansh-wav2vec2-experimentation<br>Repository containing experimentation platform on how to train, infer on... | | Emerging |
| 785 | tover0314-w/opentypeless<br>Talkmore with Opentypeless. Type with your voice. Anywhere. Talk -... | | Emerging |
| 786 | themanyone/voice_typing<br>State-of-the-art offline (or networked) voice typing everywhere + text... | | Emerging |
| 787 | coqui-ai/open-speech-corpora<br>💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies | | Emerging |
| 788 | George0828Zhang/torch_cif<br>A fast parallel PyTorch implementation of the "CIF: Continuous... | | Emerging |
| 789 | PaddlePaddle/Parakeet<br>PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer... | | Emerging |
| 790 | ishandutta2007/Awesome-Text-to-Speech<br>🎤 A curated list of the latest and most influential tools, models, and... | | Emerging |
| 791 | oliverguhr/wav2vec2-live<br>Live speech recognition using Facebook's wav2vec 2.0 model. | | Emerging |
| 792 | AdroitAnandAI/Indian-Accent-Speech-Recognition<br>Traditional ASR (Signal & Cepstral Analysis, DTW, HMM) & DNNs (Custom Models... | | Emerging |
| 793 | yl4579/StyleTTS<br>Official Implementation of StyleTTS | | Emerging |
| 794 | BolajiAyodeji/chat-with-siri<br>🤖 A text-to-speech chatbot built using Nextjs, OpenAI, and ElevenLabs. | | Emerging |
| 795 | HadrienGardeur/web-speech-recommended-voices<br>A list of recommended voices for the Web Speech API | | Emerging |
| 796 | undertheseanlp/automatic_speech_recognition<br>Vietnamese Automatic Speech Recognition | | Emerging |
| 797 | Chris10M/Lip2Speech<br>A pipeline to read lips and generate speech for the read content, i.e. Lip to... | | Emerging |
| 798 | alexram1313/text-to-speech-sample<br>Python3 Text to Speech Video Sample | | Emerging |
| 799 | libdriver/ld3320<br>LD3320 full-featured driver library for general-purpose MCU and Linux. | | Emerging |
| 800 | HKoon/ChatTTS-OpenVoice<br>Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your... | | Emerging |