All Voice AI Tools
6,981 tools ranked by quality score · Page 42 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 4101 | prokhororlov/VoiceCraft: Book to MP3 converter. Convert e-books (FB2, EPUB, TXT) to MP3 audiobooks... | | Experimental |
| 4102 | aiola-lab/aiola-js-sdk: The official JavaScript/TypeScript SDK for the aiOla API | | Experimental |
| 4103 | naver/multilingual-distilwhisper: This repository contains all the code necessary for running the multilingual... | | Experimental |
| 4104 | gongouveia/Whisper-Synthetic-ASR-Dataset-Generator: This UI serves as a Synthetic ASR Dataset Generator powered by/for OpenAI... | | Experimental |
| 4105 | dustland/talk: IELTS Speaking. | | Experimental |
| 4106 | Revocalize/revocalize-python: The official Python API for Revocalize AI voice synthesizer platform. | | Experimental |
| 4107 | sandeepmukku12/vocodine: 🎙️ VocoDine: Book your table with your voice! Speak your booking details,... | | Experimental |
| 4108 | SatyamPote/Ai-Video-Interviewer: An AI-powered mock interview platform that simulates a real-time video call... | | Experimental |
| 4109 | dobby-seo/kosr: Transformer-based Korean speech recognition | | Experimental |
| 4110 | 1ytic/edit-distance-papers: A curated list of papers dedicated to edit-distance as an objective function | | Experimental |
| 4111 | Hariswar8018/Star-Wish-AI-Stories: Create Stories with AI, View Stories as well as Scan BarCode to know more... | | Experimental |
| 4112 | Smorodov/kaldi_vosk_win_cmake: CMake-based Kaldi + Vosk + microphone speech recognition example | | Experimental |
| 4113 | abdufelsayed/talkio: Talkio - TypeScript voice AI orchestration: STT + LLM + TTS with streaming,... | | Experimental |
| 4114 | VARCOVoice/VARCOVoice_UNITYSDK: Official Unity SDK for VARCO Voice API. High-quality AI text-to-speech,... | | Experimental |
| 4115 | Geguchh024/VocalizeMD: A VS Code extension that converts Markdown files to natural-sounding speech... | | Experimental |
| 4116 | wq2012/VB_diarization: VB Diarization with Eigenvoice and HMM Priors, refactored | | Experimental |
| 4117 | partrita/tts-kokoro-app: Local app for Kokoro TTS. | | Experimental |
| 4118 | BluShooz/text-to-video-generator: SOTA Text-to-Video Generator with MuseTalk 1.5, LivePortrait, and LTX-Video.... | | Experimental |
| 4119 | Kaljurand/Grammars: Grammatical Framework based speech recognition grammars for Estonian,... | | Experimental |
| 4120 | FairyDevicesRD/mimi.client.kotlin: mimi(R) API Client for Kotlin | | Experimental |
| 4121 | Yangyangii/Tacotron-pytorch: Tacotron implementation with PyTorch 1.0 | | Experimental |
| 4122 | mklement0/speak.awf: An Alfred 3 workflow that uses macOS's TTS (text-to-speech) feature to speak... | | Experimental |
| 4123 | bitgineer/Speakeasy: Privacy-first local voice-to-text using Whisper AI. Cross-platform desktop... | | Experimental |
| 4124 | gikonyob/speake3: Speake3 library provides a wrapper around Espeak to easily write efficient... | | Experimental |
| 4125 | funnyzak/xfyun-nls: Node module for iFlytek Cloud (Xfyun) intelligent speech processing. | | Experimental |
| 4126 | ntddk/transcibe: A script to transcribe audio files with Google Cloud Speech API. | | Experimental |
| 4127 | NassimaOULDOUALI/Prosody-Control-French-TTS: An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control | | Experimental |
| 4128 | WelkinYang/EMPHASIS-pytorch: EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System | | Experimental |
| 4129 | ORI-Muchim/Grad-TTS: 'Grad-TTS' with Multilingual Cleaners | | Experimental |
| 4130 | grossstadtmann/elevenbatch: Batch creation of text-to-speech files with the Elevenlabs.io API. | | Experimental |
| 4131 | cihanselim/python-codebyvoice: Talk for programming 📢 with Google speech recognition | | Experimental |
| 4132 | tltrogl/diaremot2-on: DiaRemot2-ON: CPU-only audio intelligence pipeline (Faster-Whisper, ONNX,... | | Experimental |
| 4133 | m15-ai/TrooperAI: Conversational AI, local, low-latency voice assistant for Raspberry Pi 5... | | Experimental |
| 4134 | babua/TTSDatasetRecorder: A simple app for recording speech datasets. | | Experimental |
| 4135 | QuantiusBenignus/Spoken: Joplin text notes and to-dos via OFFLINE speech recognition. To-do reminders... | | Experimental |
| 4136 | mozilla-ai/speech-to-text: Blueprint by Mozilla.ai on how to transcribe audio files | | Experimental |
| 4137 | FluxCapacitor2/whisper-asr-webapp: A web app for automatic speech recognition using OpenAI's Whisper model... | | Experimental |
| 4138 | jik876/hifi-gan-demo: Audio samples from "HiFi-GAN: Generative Adversarial Networks for Efficient... | | Experimental |
| 4139 | dimitriStoidis/GenGAN: Repository for the paper: Generating gender-ambiguous voices for... | | Experimental |
| 4140 | AndreaLombax/Speech_emotion_recognition: This work proposes a speech emotion recognition model based on the... | | Experimental |
| 4141 | aliyzd95/modified-shemo: A modification of the Sharif Emotional Speech Database | | Experimental |
| 4142 | PrashanthaTP/wav2mov: Speech to Facial Animation using GANs | | Experimental |
| 4143 | Ashmit-Kumar/Assess-AI: End-to-end AI interview platform featuring live voice interaction, coding... | | Experimental |
| 4144 | timf34/Article2Audio: Convert articles to audio using OpenAI's Text to Speech API via a Python... | | Experimental |
| 4145 | mvalancy/logitech_bcc950: A talking eyeball on a stick - Logitech BCC950 PTZ camera control scripts | | Experimental |
| 4146 | Sharan-Kumar-R/Talk2Translate: The application uses SpeechRecognition, GoogleTranslator, and gTTS to... | | Experimental |
| 4147 | taresh18/livekit-kokoro: LiveKit TTS plugin for Kokoro | | Experimental |
| 4148 | transitive-bullshit/unrealspeech-api: TypeScript client for the Unreal Speech TTS API. | | Experimental |
| 4149 | SpringerNLP/Chapter12: Chapter 12: End-to-end Speech Recognition | | Experimental |
| 4150 | danvers/medienpaed-asr: Understanding ASR | | Experimental |
| 4151 | RakeshBabuGajula/real-time-voice-translator: A real-time voice translator web app built with Streamlit that captures live... | | Experimental |
| 4152 | jaju/voissistant: Voiss Aceistant - Apple only, with mlx. | | Experimental |
| 4153 | PanosAntoniadis/slp-ntua: Lab exercises of Speech and Language Processing course in NTUA | | Experimental |
| 4154 | microsoft/MunTTS-A-Text-to-Speech-System-For-Mundari: Official Codebase for "MunTTS: A Text-to-Speech System for Mundari"... | | Experimental |
| 4155 | xulihang/Silhouette: An open source computer-aided translation tool for audios and videos | | Experimental |
| 4156 | Mmesek/mUSh: Ultrastar Songs Creation/Management helper utils. | | Experimental |
| 4157 | IAMJOYBO/index-tts: Automatically builds Docker images and uploads them to Alibaba Cloud | | Experimental |
| 4158 | speaking-portal-project-team-a/The-Speaking-Portal-Project: The objective of the Speaking Portal Project is to design, develop, and... | | Experimental |
| 4159 | burntcarrot/quackspeak: Text-to-speech using ducks. 🦆 | | Experimental |
| 4160 | lifeCoder123/Speech-to-Text-Converter: Speech-to-text converter tool using Google Speech Cloud API to convert... | | Experimental |
| 4161 | aminul-huq/Speech-Command-Classification: Speech command classification on Speech-Command v0.02 dataset using PyTorch... | | Experimental |
| 4162 | narVidhai/Speech-Transcription-Benchmarking: Example Python scripts to evaluate various ASR methods | | Experimental |
| 4163 | malob/article-to-audio-cloud-function: Google Cloud Function that takes a URL, converts the article at that URL to... | | Experimental |
| 4164 | KuchikiRenji/vall-e: Unofficial PyTorch implementation of VALL-E: zero-shot text-to-speech and... | | Experimental |
| 4165 | popcornell/MicRank: MicRank is a Learning to Rank neural channel selection framework where a DNN... | | Experimental |
| 4166 | anicolson/matlab_feat: Functions for creating speech features in MATLAB. | | Experimental |
| 4167 | mvshyvk/KaldiService: Service for easy access to speech recognition capabilities of Kaldi using... | | Experimental |
| 4168 | George0828Zhang/simulst: PyTorch toolkit for streaming speech recognition, speech translation and... | | Experimental |
| 4169 | FarawaySail/Kaldi_thchs30: Speech recognition course project for Media and Cognition | | Experimental |
| 4170 | meichthys/sword_drill: Displays Bible verses from parsed microphone input. | | Experimental |
| 4171 | JunhoKim94/ASR_project: This repository was created for the NHN ASR hackathon competition. | | Experimental |
| 4172 | german-asr/nvidia-jasper-german: Scripts for training NVIDIA Jasper for German Speech Recognition (ASR). | | Experimental |
| 4173 | loryanstrant/HA-ElevenLabs-Custom-TTS: An ElevenLabs TTS integration for Home Assistant that allows for creation of... | | Experimental |
| 4174 | NetherQuartz/TextForSpeechNormalizer: A Python library to accentuate Russian text | | Experimental |
| 4175 | Rajvardhman05/openwhisper-app: Free, open-source voice-to-text for macOS - 100% local, offline... | | Experimental |
| 4176 | davidsuragan/issai-playground: A Python toolkit for accessing ISSAI’s AI services - Oylan (LLM), Soyle... | | Experimental |
| 4177 | zvadaadam/speech-recognition: End-to-End Speech Recognition with TensorFlow | | Experimental |
| 4178 | TeaPoly/cat_tensorflow: CRF-based ASR Toolkit implemented with TensorFlow | | Experimental |
| 4179 | TeaPoly/warp-ctc-crf: An extension of thu-spmi/CAT which contains a full-fledged implementation of... | | Experimental |
| 4180 | upskyy/Automatic-Speech-Recognition-Models: End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra. | | Experimental |
| 4181 | Kimosabey/vox-agent-neural: Neural Voice Agent core constructs for conversational AI. | | Experimental |
| 4182 | yinruiqing/tiny-transducer: Tiny Transducer: A Highly-Efficient Speech Recognition Model on Edge Devices | | Experimental |
| 4183 | sunprinceS/MetaASR-CrossAccent: Meta-Learning for End-to-End ASR | | Experimental |
| 4184 | duc11021102/pyspeech: Python Text To Speech Using gTTS @duc11021102 | | Experimental |
| 4185 | ActiveIntelligentSystemsLab/japanese_tts_ros: ROS node that outputs Japanese text as speech | | Experimental |
| 4186 | derpeloper/ostinato: giving a voice to the voiceless. | | Experimental |
| 4187 | lgpearson1771/openwakeword-trainer: Train custom wake word models with openWakeWord. A granular 13-step pipeline... | | Experimental |
| 4188 | vijethph/violet-speech: Violet is a Speech Assistant made using Python | | Experimental |
| 4189 | led-mirage/AivoClip: An app that uses A.I.VOICE to read aloud text placed on the clipboard. | | Experimental |
| 4190 | 2tocom/F5-TTS-Vietnamese-Google-Colab: Vietnamese TTS, converts text to Vietnamese speech, text to speech... | | Experimental |
| 4191 | AssemblyAI/assemblyai-ruby-sdk: The AssemblyAI Ruby SDK provides an easy-to-use interface for interacting... | | Experimental |
| 4192 | emmanuelinfante/SubtitlesEveryone: Transcribe Like a Pro, Without Paying a Penny! | | Experimental |
| 4193 | junhoeKu/Jeju-Translation: Project to build a bidirectional speech translation model between the Jeju language and standard Korean (algorithms, unstructured data, NLP, deep learning, machine translation, speech recognition, multimodal) | | Experimental |
| 4194 | BitsofJeremy/WeirDing: Audiobook narration engine powered by Qwen3-TTS. Upload documents, pick a... | | Experimental |
| 4195 | Vatis-Tech/asr-client-js: JavaScript SDK client for Vatis Tech ASR services. | | Experimental |
| 4196 | AssemblyAI/assemblyai-semantic-kernel: Transcribe audio using AssemblyAI with Semantic Kernel plugins. | | Experimental |
| 4197 | marcogenna/epub2audiobook: Convert EPUB books to M4B audiobooks with AI-powered TTS (Edge TTS, Kokoro, Piper) | | Experimental |
| 4198 | LucaAngioloni/Micchinetta: HCI project: an application interface using both face and speech recognition... | | Experimental |
| 4199 | JoshuaCarroll/RepeaterProgrammingUtility: N5JLC Repeater Programming Utility | | Experimental |
| 4200 | Listening-Lab/Annotator: Listening Lab audio analysis and annotation tool. Develop audio... | | Experimental |