All Voice AI Tools
6,981 tools ranked by quality score · Page 15 of 70
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1401 | xenova/kokoro-web | ML-powered speech synthesis directly in your browser | | Emerging |
| 1402 | IhorShevchuk/RHVoice-spm | A free and open source speech synthesizer with support for a lot of languages... | | Emerging |
| 1403 | zakuro-ai/asr | ASRDeepspeech x Sakura-ML (English/Japanese) with deepspeech2 model in... | | Emerging |
| 1404 | kgnlp/allophant | A multilingual phoneme recognizer capable of generalizing zero-shot to... | | Emerging |
| 1405 | bedriyan/speaky | Voice-to-text for macOS, powered by on-device AI. Press a hotkey, speak, and... | | Emerging |
| 1406 | roboticslab-uc3m/speech | Text To Speech (TTS) and Automatic Speech Recognition (ASR). | | Emerging |
| 1407 | mrtrizer/UnityPiper | Offline text to speech inside Unity | | Emerging |
| 1408 | aqiu202/aqiu-spring-boot-starter-projects | A personal collection of ready-to-use Spring Boot Starter components, simple and practical, to be extended as needs arise. | | Emerging |
| 1409 | pnlpal/pnl-reader | PNL Reader: read quietly or read aloud | | Emerging |
| 1410 | chrisvdev/obs-chat | Also known as CVTalk, a Twitch chat viewer made with React for use in OBS... | | Emerging |
| 1411 | liuhaozhe6788/voice-cloning-collab | An improved version of Real-time-voice-cloning | | Emerging |
| 1412 | hiteshsahu/Android-TTS-STT | One line solution for Android Text to Speech (TTS) & Speech to Text (STT)... | | Emerging |
| 1413 | supikiti/PNCC | An implementation of Power Normalized Cepstral Coefficients: PNCC | | Emerging |
| 1414 | OwenTyme/voice-zero | Collection of samples suitable for use with zero-shot text to speech engines. | | Emerging |
| 1415 | arghyasur1991/LiveTalk-Unity | LiveTalk is a unified, high-performance talking head generation system that... | | Emerging |
| 1416 | DrewThomasson/ebook2audiobookSTYLETTS2 | This simple program makes use of Calibre to convert an ebook into chapters... | | Emerging |
| 1417 | mayeaux/generate-subtitles | Generate transcripts for audio and video content with a user friendly UI,... | | Emerging |
| 1418 | nihui/ncnn-android-piper | ncnn Android Piper: the fast and local neural text-to-speech engine | | Emerging |
| 1419 | sooftware/speech-transformer | Transformer implementation specialized in speech recognition tasks using PyTorch. | | Emerging |
| 1420 | cool-japan/voirs | VoiRS is a cutting-edge Text-to-Speech (TTS), Voice Recognition, Sound... | | Emerging |
| 1421 | hubendubler/gTTS.js | A Promise based Node.js/TypeScript port of the gTTS Google-Text-To-Speech... | | Emerging |
| 1422 | gmltmd789/UnitSpeech | An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis... | | Emerging |
| 1423 | ShawnHymel/tflite-speech-recognition | Demo for training a convolutional neural network to classify words and... | | Emerging |
| 1424 | sldimitrov/english_learning_system | English Learning System I have developed in order to help others in... | | Emerging |
| 1425 | zthxxx/python-Speech_Recognition | A simple example of using the Baidu speech recognition API with Python. | | Emerging |
| 1426 | rishikksh20/SoundStorm-pytorch | Google's SoundStorm: Efficient Parallel Audio Generation | | Emerging |
| 1427 | ReneTode/My-AppDaemon | My apps, my helpfiles, all about AppDaemon for Home Assistant | | Emerging |
| 1428 | darkautism/sensevoice-rs | A Rust-based, SenseVoiceSmall | | Emerging |
| 1429 | spokestack/spokestack-python | Spokestack is a library that allows a user to easily incorporate a voice... | | Emerging |
| 1430 | tabahi/contexless-phonemes-CUPE | PyTorch model for contextless phoneme prediction from speech audio | | Emerging |
| 1431 | alessandroragano/scoreq | SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024) | | Emerging |
| 1432 | cameronking4/openai-realtime-blocks | Voice AI components using OpenAI Realtime API to copy and paste into your... | | Emerging |
| 1433 | OlivierMary/MySuperWhisper | A global voice dictation tool for Linux using local OpenAI Whisper. Fast,... | | Emerging |
| 1434 | flogy/gatsby-mdx-tts | 🗣 Adds speech output to your Gatsby site using Amazon Polly. | | Emerging |
| 1435 | Aivis-Project/aivmlib-web | Aivis Voice Model File (.aivm/.aivmx) Utility Library for Web | | Emerging |
| 1436 | nodef/extra-googletts | Generate speech audio from super long text through machine (via "Google... | | Emerging |
| 1437 | black-roland/homeassistant-yandex-speechkit | Yandex SpeechKit integration for Home Assistant providing speech-to-text and... | | Emerging |
| 1438 | espnet/interspeech2019-tutorial | INTERSPEECH 2019 Tutorial Materials | | Emerging |
| 1439 | balisujohn/tortoise.cpp | A ggml (C++) re-implementation of tortoise-tts | | Emerging |
| 1440 | hans00/phonemize | Pure JS fast phonemizer with rule-based G2P prediction | | Emerging |
| 1441 | smartherd/SpeechToText | Speech To Text in Android | | Emerging |
| 1442 | playht/text-to-speech-api | Play.ht's Text to Speech API | | Emerging |
| 1443 | npuichigo/voicenet | Speech synthesis platform based on TensorFlow and Sonnet | | Emerging |
| 1444 | 6drf21e/ChatTTS_colab | 🚀 One-click deployment (offline bundle included)! Based on ChatTTS, with streaming output, random voice sampling, long-audio generation, and multi-role reading. Easy to use, no complex installation required. | | Emerging |
| 1445 | sovse/Rus-SpeechRecognition-LSTM-CTC-VoxForge | Russian speech recognition using TensorFlow, trained on the VoxForge dataset | | Emerging |
| 1446 | SergeyShk/Speech-to-Text-Russian | A project for Russian speech recognition based on pykaldi. | | Emerging |
| 1447 | keonlee9420/DailyTalk | Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational... | | Emerging |
| 1448 | MiniMax-AI/MiniMax-AI.github.io | The official GitHub Page for MiniMax | | Emerging |
| 1449 | jorge-menjivar/super-stt | Super STT enables effortless voice-to-text in any application, using the... | | Emerging |
| 1450 | CodeBySonu95/VoxSherpa-TTS | 🎙️ VoxSherpa TTS Offline Neural Text-to-Speech Engine for Android ⚡... | | Emerging |
| 1451 | sevangelatos/py-ttspico | Python svox picotts wrapper | | Emerging |
| 1452 | ioBroker/ioBroker.sonus | Control ioBroker with voice | | Emerging |
| 1453 | HarunoriKawano/Wav2vec2.0 | Implementation of the paper "wav2vec 2.0: A Framework for Self-Supervised... | | Emerging |
| 1454 | nature-heart-software/izabela | Your speech assistant. Communicate with text-to-speech in games, on voice... | | Emerging |
| 1455 | LuckyHookin/edge-TTS-record | A Windows tool that records Microsoft Edge's text-to-speech (TTS) output and saves it as .wav audio. | | Emerging |
| 1456 | tomchang25/whisper-auto-transcribe | Auto transcribe tool based on whisper | | Emerging |
| 1457 | orange2ai/youtube-subtitle-translator | 🌐 Real-time YouTube subtitle translator browser extension. Translate... | | Emerging |
| 1458 | TheNewC0der-24/Textonus | Voice to Text Online Notepad Professional, Accurate & Free Speech... | | Emerging |
| 1459 | gooofy/py-picotts | Python wrappers around SVOX Pico TTS | | Emerging |
| 1460 | 1ytic/open_stt_e2e | PyTorch end-to-end speech recognition | | Emerging |
| 1461 | LonePheasantWarrior/TalkifyTTS | A cloud LLM-powered Android speech synthesis app (TTS engine). Supports Doubao, Tencent, Microsoft, Qwen, and other models. An Android text-to-speech... | | Emerging |
| 1462 | harvard-edge/multilingual_kws | Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus | | Emerging |
| 1463 | MuGuiLin/VoiceDictation | iFlytek voice dictation WebAPI: converts speech (≤60 seconds) into the corresponding text, letting machines "understand" human language, like giving the machine "ears" so that it can "hear". | | Emerging |
| 1464 | JusperLee/Conv-TasNet | Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech... | | Emerging |
| 1465 | KernelInterrupt/whisper4dart | whisper4dart is a dart wrapper for whisper.cpp, designed to offer an... | | Emerging |
| 1466 | bold-ronin/lira | A Voice-First AI Companion | | Emerging |
| 1467 | jhubbardsf/svelte-speech-recognition | Speech recognition library for Svelte | | Emerging |
| 1468 | dictate-button/dictate-button | Customizable Web Component that adds speech-to-text dictation capabilities... | | Emerging |
| 1469 | spokestack/spokestack-android | Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS.... | | Emerging |
| 1470 | VITA-Group/Audio-Lottery | [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight,... | | Emerging |
| 1471 | aiola-lab/drax | Drax: Speech Recognition with Discrete Flow Matching | | Emerging |
| 1472 | GuillaumeFalourd/formulas-python | Ritchie CLI formulas in Python 🐍 | | Emerging |
| 1473 | 1038lab/ComfyUI-MegaTTS | A ComfyUI custom node based on ByteDance MegaTTS3, enabling high-quality... | | Emerging |
| 1474 | bytectlgo/edge-tts | Edge TTS is a command-line tool based on Microsoft Edge's text-to-speech... | | Emerging |
| 1475 | evilC/HotVoice | Adds Speech Recognition support to AutoHotkey, via a C# DLL | | Emerging |
| 1476 | alan-ai/alan-sdk-reactnative | The Self-Coding System for Your App: Alan AI SDK for React Native | | Emerging |
| 1477 | jamditis/audiobash | Voice-controlled terminal for developers. Speak commands, execute instantly. | | Emerging |
| 1478 | khanld/ASR-Wav2vec-Finetune | :zap: Fine-tune Wav2vec 2.0 For Speech Recognition | | Emerging |
| 1479 | speechsuper/SpeechSuper-API-Samples | Deep learning based speech and pronunciation assessment API for 8 languages. | | Emerging |
| 1480 | jordicor/santa-claus-is-calling | A magical Christmas experience where Santa Claus (AI with Santa's voice)... | | Emerging |
| 1481 | QuantiusBenignus/blurt | Gnome shell extension for accurate OFFLINE speech to text input in Linux... | | Emerging |
| 1482 | xingchensong/Speech-Transformer-tf2.0 | Transformer for ASR systems (via TensorFlow 2.0) | | Emerging |
| 1483 | MarkParker5/STARK | S.T.A.R.K. - Speech And Text Algorithmic Recognition Kit | | Emerging |
| 1484 | andi611/ZeroSpeech-TTS-without-T | A PyTorch implementation for the ZeroSpeech 2019 challenge. | | Emerging |
| 1485 | openconcerto/MisterWhisper | Push to talk voice recognition using Whisper | | Emerging |
| 1486 | piotrkawa/deepfake-whisper-features | Implementation of the paper "Improved DeepFake Detection Using Whisper Features" | | Emerging |
| 1487 | askrella/speech-rest-api | Transcription and TTS Rest API (OpenAI Whisper, Speechbrain) | | Emerging |
| 1488 | AA-Factory/aafactory-prototype | ⚡ AI Avatar Factory is an interface for creating and managing AI avatars. ⚡ | | Emerging |
| 1489 | iceychris/LibreASR | :speech_balloon: An On-Premises, Streaming Speech Recognition System | | Emerging |
| 1490 | kaushiknishchay/ComfyUI-Qwen3-ASR | ComfyUI nodes for Qwen3-ASR (0.6B/1.7B) and ForcedAligner. Supports... | | Emerging |
| 1491 | rohit-lakhanpal/ai-hackathon-starter-kit | This project has been created to make AI accessible and easy for everyone.... | | Emerging |
| 1492 | twangodev/speak-mintlify | Automatically generate voice narration for your Mintlify documentation. | | Emerging |
| 1493 | SpeechColab/Leaderboard | SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform... | | Emerging |
| 1494 | gorkemkaramolla/whisper-run | Faster Whisper with Speaker Diarization | | Emerging |
| 1495 | Saganaki22/ComfyUI-KittenTTS | 😻 A simple ComfyUI custom node for KittenTTS - an ultra-lightweight... | | Emerging |
| 1496 | botbahlul/crx-live-translate | Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video... | | Emerging |
| 1497 | ElmTran/praises | Praises is a text-to-speech tool that can help you read text easily. | | Emerging |
| 1498 | AkishinoShiame/Chinese-Speech-Emotion-Datasets | Datasets of A Deep Convolutional Neural Network Based Virtual Elderly... | | Emerging |
| 1499 | hcy71o/SC-CNN | SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker... | | Emerging |
| 1500 | jingangdidi/voice_clone | An OpenVoice-based voice cloning tool, single executable file (~14M),... | | Emerging |