All Voice AI Tools

6,983 tools ranked by quality score

# Tool Score Tier
1 espnet/espnet

End-to-End Speech Processing Toolkit

96
Verified
2 TalAter/annyang

💬 Speech recognition for your site

93
Verified
3 Blaizzy/mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS)...

93
Verified
4 elevenlabs/elevenlabs-python

The official Python SDK for the ElevenLabs API.

92
Verified
5 k2-fsa/sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement,...

91
Verified
6 Uberi/speech_recognition

Speech recognition module for Python, supporting several engines and APIs,...

90
Verified
7 m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

90
Verified
8 jdepoix/youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a...

86
Verified
9 DrewThomasson/ebook2audiobook

Generate audiobooks from e-books, voice cloning & 1158+ languages!

84
Verified
10 KoljaB/RealtimeTTS

Converts text to speech in realtime

84
Verified
11 cmusphinx/pocketsphinx

A small speech recognizer

84
Verified
12 PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model,...

82
Verified
13 alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers...

81
Verified
14 OpenBMB/VoxCPM

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and...

81
Verified
15 pndurette/gTTS

Python library and CLI tool to interface with Google Translate's text-to-speech API

78
Verified
16 rany2/edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT...

76
Verified
17 nateshmbhat/pyttsx3

Offline Text To Speech synthesis for Python

75
Verified
18 denizsafak/abogen

Generate audiobooks from EPUBs, PDFs and text with synchronized captions.

75
Verified
19 gradio-app/fastrtc

The Python library for real-time communication

75
Verified
20 salute-developers/GigaAM

Foundational Model for Speech Recognition Tasks

74
Verified
21 espeak-ng/espeak-ng

eSpeak NG is an open source speech synthesizer that supports more than...

73
Verified
22 ggml-org/whisper.cpp

Port of OpenAI's Whisper model in C/C++

72
Verified
23 huggingface/speech-to-speech

Build local voice agents with open-source models

72
Verified
24 descriptinc/descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz,...

72
Verified
25 supertone-inc/supertonic

Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.

71
Verified
26 Picovoice/porcupine

On-device wake word detection powered by deep learning

70
Verified
27 jianchang512/pyvideotrans

Translate the video from one language to another and embed dubbing & subtitles.

70
Verified
28 thewh1teagle/kokoro-onnx

TTS with kokoro and onnx runtime

70
Verified
29 santinic/audiblez

Generate audiobooks from e-books

70
Verified
30 readest/readest

Readest is a modern, feature-rich ebook reader designed for avid readers...

69
Established
31 livekit/livekit

End-to-end realtime stack for connecting humans and AI

69
Established
32 IAHispano/Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

69
Established
33 speechmatics/speechmatics-python

Python library and CLI for Speechmatics

69
Established
34 rapidaai/voice-ai

Rapida is an open-source, end-to-end voice AI orchestration platform for...

69
Established
35 pnnbao97/VieNeu-TTS

Vietnamese TTS with instant voice cloning • On-device • Real-time CPU...

69
Established
36 coqui-ai/TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research...

69
Established
37 fishaudio/fish-speech

SOTA Open Source TTS

68
Established
38 linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

68
Established
39 collabora/WhisperLive

A nearly-live implementation of OpenAI's Whisper.

68
Established
40 foyoux/pygtrans

Google Translate, with API key support; translates 100,000 entries in one go

67
Established
41 jamiepine/voicebox

The open-source voice synthesis studio

67
Established
42 compulim/web-speech-cognitive-services

Polyfill Web Speech API with Cognitive Services for both speech-to-text and...

67
Established
43 Softcatala/whisper-ctranslate2

Whisper command line client compatible with original OpenAI client based on...

67
Established
44 mozilla-ai/document-to-podcast

Blueprint by Mozilla.ai for generating podcasts from documents using local AI

66
Established
45 istupakov/onnx-asr

A lightweight Python package for Automatic Speech Recognition using ONNX models

66
Established
46 kxxt/aspeak

A simple text-to-speech client for Azure TTS API.

66
Established
47 ccoreilly/vosk-browser

A speech recognition library running in the browser thanks to a WebAssembly...

66
Established
48 met4citizen/TalkingHead

Talking Head (3D): A JavaScript class for real-time lip-sync using full-body...

66
Established
49 TensorSpeech/TensorFlowTTS

😝 TensorFlowTTS: Real-Time State-of-the-art...

66
Established
50 playht/pyht

PlayHT Python SDK - AI Text-to-Speech Streaming & Voice Cloning API

65
Established
51 FluidInference/FluidAudio

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text,...

65
Established
52 SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

65
Established
53 CorentinJ/Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

65
Established
54 devnen/Chatterbox-TTS-Server

Self-host the powerful Chatterbox TTS model. This server offers a...

64
Established
55 fishaudio/Bert-VITS2

vits2 backbone with multilingual-bert

64
Established
56 snakers4/silero-models

Silero Models: pre-trained text-to-speech models made embarrassingly simple

64
Established
57 ChetanXpro/nodejs-whisper

NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as...

64
Established
58 k2-fsa/sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using...

64
Established
59 FunAudioLLM/CosyVoice

Multi-lingual large voice generation model, providing inference, training...

64
Established
60 Rei-x/discord-speech-recognition

Speech to text extension for discord.js

64
Established
61 nazdridoy/kokoro-tts

A CLI text-to-speech tool using the Kokoro model, supporting multiple...

63
Established
62 herimor/voxtream

VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and...

63
Established
63 lucidrains/HS-TasNet

Implementation of HS-TasNet, "Real-time Low-latency Music Source Separation...

63
Established
64 travisvn/chatterbox-tts-api

Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling...

63
Established
65 fgnt/meeteval

MeetEval - A meeting transcription evaluation toolkit

63
Established
66 Picovoice/web-voice-processor

A library for real-time voice processing in web browsers

63
Established
67 index-tts/index-tts

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

63
Established
68 yeyupiaoling/MASR

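A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition; currently supports Conformer, Squeezeformer, DeepSpeech2...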

63
Established
69 rsxdalv/TTS-WebUI

A single Gradio + React WebUI with extensions for ACE-Step, Kimi Audio,...

63
Established
70 mbailey/voicemode

Natural (2-way) voice conversations with Claude Code

63
Established
71 FelippeChemello/podcast-maker

Fully automated video maker using motion graphics and text-to-speech...

63
Established
72 readbeyond/aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize...

63
Established
73 analyticsinmotion/werpy

🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error...

63
Established
74 yeyupiaoling/PPASR

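End-to-end Chinese speech recognition built on PaddlePaddle, from first steps to production use, with very simple starter examples and highly practical enterprise projects. Supports the currently most popular DeepSpeech2, Confor...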

63
Established
75 daswer123/xtts-api-server

A simple FastAPI Server to run XTTSv2

63
Established
76 jatinkrmalik/vocalinux

Free, open-source, 100% offline voice dictation for Linux. Speak and type...

63
Established
77 meizhong986/WhisperJAV

ASR/STT subtitle generator. Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD....

63
Established
78 EDCD/EDDI

Companion application for Elite Dangerous

62
Established
79 tensorflow/lingvo

Lingvo

62
Established
80 khanld/chunkformer

ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription

62
Established
81 shibing624/parrots

Automatic Speech Recognition (ASR), Text-To-Speech (TTS) engine....

62
Established
82 tsmdt/whisply

💬 Fast, cross-platform CLI and GUI for batch transcription, translation,...

62
Established
83 Ailln/cn2an

📦 Quickly converts between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

62
Established
84 thewh1teagle/sherpa-rs

Rust bindings to https://github.com/k2-fsa/sherpa-onnx

62
Established
85 kahne/fastwer

A PyPI package for fast word/character error rate (WER/CER) calculation

62
Established
86 TensorSpeech/TensorFlowASR

⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in...

62
Established
87 thewh1teagle/phonikud

Hebrew grapheme to phoneme (G2P)

62
Established
88 k2-fsa/sherpa

Speech-to-text server framework with next-gen Kaldi

62
Established
89 diodiogod/TTS-Audio-Suite

A ComfyUI custom node integration for multi-engine multi-language...

62
Established
90 modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA...

62
Established
91 speechbrain/speechbrain

A PyTorch-based Speech Toolkit

61
Established
92 lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model,...

61
Established
93 RHVoice/RHVoice

a free and open source speech synthesizer for Russian and other languages

61
Established
94 alphacep/vosk

VOSK Speech Recognition Toolkit

61
Established
95 daanzu/kaldi-active-grammar

Python Kaldi speech recognition with grammars that can be set...

61
Established
96 morganney/tts-react

Convert text to speech using React.

61
Established
97 openctp/openctp

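openctp provides channels including CTP stock options, Zhongtai Securities XTP, Huaxin Securities Qidian TORA, Orient Securities OST, East Money Securities EMT, Interactive Brokers TWS, Esunny TAP, Liangtou QDP, and others...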

61
Established
98 argmaxinc/WhisperKit

On-device Speech Recognition for Apple Silicon

61
Established
99 EDDiscovery/EDDiscovery

Captains log and 3d star map for Elite Dangerous

61
Established
100 pion/mediadevices

Go implementation of the MediaDevices API.

60
Established