All Voice AI Tools
6,983 tools ranked by quality score
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | espnet/espnet: End-to-End Speech Processing Toolkit | | Verified |
| 2 | TalAter/annyang: 💬 Speech recognition for your site | | Verified |
| 3 | Blaizzy/mlx-audio: A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS)... | | Verified |
| 4 | elevenlabs/elevenlabs-python: The official Python SDK for the ElevenLabs API. | | Verified |
| 5 | k2-fsa/sherpa-onnx: Speech-to-text, text-to-speech, speaker diarization, speech enhancement,... | | Verified |
| 6 | Uberi/speech_recognition: Speech recognition module for Python, supporting several engines and APIs,... | | Verified |
| 7 | m-bain/whisperX: WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) | | Verified |
| 8 | jdepoix/youtube-transcript-api: This is a python API which allows you to get the transcript/subtitles for a... | | Verified |
| 9 | DrewThomasson/ebook2audiobook: Generate audiobooks from e-books, voice cloning & 1158+ languages! | | Verified |
| 10 | KoljaB/RealtimeTTS: Converts text to speech in realtime | | Verified |
| 11 | cmusphinx/pocketsphinx: A small speech recognizer | | Verified |
| 12 | PaddlePaddle/PaddleSpeech: Easy-to-use Speech Toolkit including Self-Supervised Learning model,... | | Verified |
| 13 | alphacep/vosk-api: Offline speech recognition API for Android, iOS, Raspberry Pi and servers... | | Verified |
| 14 | OpenBMB/VoxCPM: VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and... | | Verified |
| 15 | pndurette/gTTS: Python library and CLI tool to interface with Google Translate's text-to-speech API | | Verified |
| 16 | rany2/edge-tts: Use Microsoft Edge's online text-to-speech service from Python WITHOUT... | | Verified |
| 17 | nateshmbhat/pyttsx3: Offline Text To Speech synthesis for python | | Verified |
| 18 | denizsafak/abogen: Generate audiobooks from EPUBs, PDFs and text with synchronized captions. | | Verified |
| 19 | gradio-app/fastrtc: The python library for real-time communication | | Verified |
| 20 | salute-developers/GigaAM: Foundational Model for Speech Recognition Tasks | | Verified |
| 21 | espeak-ng/espeak-ng: eSpeak NG is an open source speech synthesizer that supports more than... | | Verified |
| 22 | ggml-org/whisper.cpp: Port of OpenAI's Whisper model in C/C++ | | Verified |
| 23 | huggingface/speech-to-speech: Build local voice agents with open-source models | | Verified |
| 24 | descriptinc/descript-audio-codec: State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz,... | | Verified |
| 25 | supertone-inc/supertonic: Lightning-Fast, On-Device, Multilingual TTS, running natively via ONNX. | | Verified |
| 26 | Picovoice/porcupine: On-device wake word detection powered by deep learning | | Verified |
| 27 | jianchang512/pyvideotrans: Translate the video from one language to another and embed dubbing & subtitles. | | Verified |
| 28 | thewh1teagle/kokoro-onnx: TTS with kokoro and onnx runtime | | Verified |
| 29 | santinic/audiblez: Generate audiobooks from e-books | | Verified |
| 30 | readest/readest: Readest is a modern, feature-rich ebook reader designed for avid readers... | | Established |
| 31 | livekit/livekit: End-to-end realtime stack for connecting humans and AI | | Established |
| 32 | IAHispano/Applio: A simple, high-quality voice conversion tool focused on ease of use and performance. | | Established |
| 33 | speechmatics/speechmatics-python: Python library and CLI for Speechmatics | | Established |
| 34 | rapidaai/voice-ai: Rapida is an open-source, end-to-end voice AI orchestration platform for... | | Established |
| 35 | pnnbao97/VieNeu-TTS: Vietnamese TTS with instant voice cloning • On-device • Real-time CPU... | | Established |
| 36 | coqui-ai/TTS: 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research... | | Established |
| 37 | fishaudio/fish-speech: SOTA Open Source TTS | | Established |
| 38 | linto-ai/whisper-timestamped: Multilingual Automatic Speech Recognition with word-level timestamps and confidence | | Established |
| 39 | collabora/WhisperLive: A nearly-live implementation of OpenAI's Whisper. | | Established |
| 40 | foyoux/pygtrans: Google Translate, with APIKEY support; translates 100,000 entries in one go | | Established |
| 41 | jamiepine/voicebox: The open-source voice synthesis studio | | Established |
| 42 | compulim/web-speech-cognitive-services: Polyfill Web Speech API with Cognitive Services for both speech-to-text and... | | Established |
| 43 | Softcatala/whisper-ctranslate2: Whisper command line client compatible with original OpenAI client based on... | | Established |
| 44 | mozilla-ai/document-to-podcast: Blueprint by Mozilla.ai for generating podcasts from documents using local AI | | Established |
| 45 | istupakov/onnx-asr: A lightweight Python package for Automatic Speech Recognition using ONNX models | | Established |
| 46 | kxxt/aspeak: A simple text-to-speech client for Azure TTS API. | | Established |
| 47 | ccoreilly/vosk-browser: A speech recognition library running in the browser thanks to a WebAssembly... | | Established |
| 48 | met4citizen/TalkingHead: Talking Head (3D): A JavaScript class for real-time lip-sync using full-body... | | Established |
| 49 | TensorSpeech/TensorFlowTTS: 😝 TensorFlowTTS: Real-Time State-of-the-art... | | Established |
| 50 | playht/pyht: PlayHT Python SDK - AI Text-to-Speech Streaming & Voice Cloning API | | Established |
| 51 | FluidInference/FluidAudio: Frontier CoreML audio models in your apps: text-to-speech, speech-to-text,... | | Established |
| 52 | SYSTRAN/faster-whisper: Faster Whisper transcription with CTranslate2 | | Established |
| 53 | CorentinJ/Real-Time-Voice-Cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time | | Established |
| 54 | devnen/Chatterbox-TTS-Server: Self-host the powerful Chatterbox TTS model. This server offers a... | | Established |
| 55 | fishaudio/Bert-VITS2: vits2 backbone with multilingual-bert | | Established |
| 56 | snakers4/silero-models: Silero Models: pre-trained text-to-speech models made embarrassingly simple | | Established |
| 57 | ChetanXpro/nodejs-whisper: NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as... | | Established |
| 58 | k2-fsa/sherpa-ncnn: Real-time speech recognition and voice activity detection (VAD) using... | | Established |
| 59 | FunAudioLLM/CosyVoice: Multi-lingual large voice generation model, providing inference, training... | | Established |
| 60 | Rei-x/discord-speech-recognition: Speech to text extension for discord.js | | Established |
| 61 | nazdridoy/kokoro-tts: A CLI text-to-speech tool using the Kokoro model, supporting multiple... | | Established |
| 62 | herimor/voxtream: VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and... | | Established |
| 63 | lucidrains/HS-TasNet: Implementation of HS-TasNet, "Real-time Low-latency Music Source Separation... | | Established |
| 64 | travisvn/chatterbox-tts-api: Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling... | | Established |
| 65 | fgnt/meeteval: MeetEval - A meeting transcription evaluation toolkit | | Established |
| 66 | Picovoice/web-voice-processor: A library for real-time voice processing in web browsers | | Established |
| 67 | index-tts/index-tts: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System | | Established |
| 68 | yeyupiaoling/MASR: PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition; currently supports Conformer, Squeezeformer, DeepSpeech2... | | Established |
| 69 | rsxdalv/TTS-WebUI: A single Gradio + React WebUI with extensions for ACE-Step, Kimi Audio,... | | Established |
| 70 | mbailey/voicemode: Natural (2-way) voice conversations with Claude Code | | Established |
| 71 | FelippeChemello/podcast-maker: Fully automated video maker using motion graphics and text-to-speech... | | Established |
| 72 | readbeyond/aeneas: aeneas is a Python/C library and a set of tools to automagically synchronize... | | Established |
| 73 | analyticsinmotion/werpy: 🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error... | | Established |
| 74 | yeyupiaoling/PPASR: End-to-end Chinese speech recognition built on PaddlePaddle, from beginner to production, with very simple starter examples and highly practical enterprise projects. Supports the currently most popular DeepSpeech2, Confor... | | Established |
| 75 | daswer123/xtts-api-server: A simple FastAPI Server to run XTTSv2 | | Established |
| 76 | jatinkrmalik/vocalinux: Free, open-source, 100% offline voice dictation for Linux. Speak and type... | | Established |
| 77 | meizhong986/WhisperJAV: ASR/STT subtitle generator. Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD.... | | Established |
| 78 | EDCD/EDDI: Companion application for Elite Dangerous | | Established |
| 79 | tensorflow/lingvo: Lingvo | | Established |
| 80 | khanld/chunkformer: ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription | | Established |
| 81 | shibing624/parrots: Automatic Speech Recognition(ASR), Text-To-Speech(TTS) engine.... | | Established |
| 82 | tsmdt/whisply: 💬 Fast, cross-platform CLI and GUI for batch transcription, translation,... | | Established |
| 83 | Ailln/cn2an: 📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more) | | Established |
| 84 | thewh1teagle/sherpa-rs: Rust bindings to https://github.com/k2-fsa/sherpa-onnx | | Established |
| 85 | kahne/fastwer: A PyPI package for fast word/character error rate (WER/CER) calculation | | Established |
| 86 | TensorSpeech/TensorFlowASR: ⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in... | | Established |
| 87 | thewh1teagle/phonikud: Hebrew grapheme to phoneme (G2P) | | Established |
| 88 | k2-fsa/sherpa: Speech-to-text server framework with next-gen Kaldi | | Established |
| 89 | diodiogod/TTS-Audio-Suite: A ComfyUI custom node integration for multi-engine multi-language... | | Established |
| 90 | modelscope/FunASR: A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA... | | Established |
| 91 | speechbrain/speechbrain: A PyTorch-based Speech Toolkit | | Established |
| 92 | lenML/Speech-AI-Forge: 🍦 Speech-AI-Forge is a project developed around TTS generation model,... | | Established |
| 93 | RHVoice/RHVoice: a free and open source speech synthesizer for Russian and other languages | | Established |
| 94 | alphacep/vosk: VOSK Speech Recognition Toolkit | | Established |
| 95 | daanzu/kaldi-active-grammar: Python Kaldi speech recognition with grammars that can be set... | | Established |
| 96 | morganney/tts-react: Convert text to speech using React. | | Established |
| 97 | openctp/openctp: openctp provides channels including CTP stock options, Zhongtai Securities XTP, Huaxin Securities Qidian TORA, Orient Securities OST, East Money Securities EMT, Interactive Brokers TWS, Esunny TAP, Liangtou QDP, and others... | | Established |
| 98 | argmaxinc/WhisperKit: On-device Speech Recognition for Apple Silicon | | Established |
| 99 | EDDiscovery/EDDiscovery: Captains log and 3d star map for Elite Dangerous | | Established |
| 100 | pion/mediadevices: Go implementation of the MediaDevices API. | | Established |