All Voice AI Tools
6,981 tools ranked by quality score · Page 9 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 801 | yc9701/pansori · Tools for ASR Corpus Generation from Online Video | | Emerging |
| 802 | funcwj/aps · A personal toolkit for single/multi-channel speech recognition & enhancement... | | Emerging |
| 803 | microsoft/SpeechT5 · Unified-Modal Speech-Text Pre-Training for Spoken Language Processing | | Emerging |
| 804 | gfdb/wav2aug · A general purpose task-agnostic speech augmentation policy | | Emerging |
| 805 | FireRedTeam/FireRedTTS · An Open-Sourced LLM-empowered Foundation TTS System | | Emerging |
| 806 | silversparro/wav2letter.pytorch · A fully convolution-network for speech-to-text, built on pytorch. | | Emerging |
| 807 | Edresson/YourTTS · YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion... | | Emerging |
| 808 | lucadellalib/focalcodec · A low-bitrate single-codebook 16 / 24 kHz speech codec based on focal modulation | | Emerging |
| 809 | ceuk/speech-recognition-aws-polyfill · Polyfill for the SpeechRecognition browser API using AWS Transcribe as a fallback | | Emerging |
| 810 | Kyubyong/cross_vc · Cross-lingual Voice Conversion | | Emerging |
| 811 | KevinMIN95/StyleSpeech · Official implementation of Meta-StyleSpeech and StyleSpeech | | Emerging |
| 812 | CSTR-Edinburgh/magphase · MagPhase Vocoder: Speech analysis/synthesis system for TTS and related applications. | | Emerging |
| 813 | baidubce/pie · Baidu Cloud streaming speech recognition client SDK | | Emerging |
| 814 | Elleo/pied · Pied makes it simple to install and manage text-to-speech Piper voices for... | | Emerging |
| 815 | AIGC-Audio/AudioGPT · AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head | | Emerging |
| 816 | nipponjo/tts-arabic-pytorch · 🎙️ Arabic TTS models (Tacotron2, FastPitch) | | Emerging |
| 817 | OpenVoiceOS/ovos-tts-plugin-cotovia · Galician TTS plugin for OVOS | | Emerging |
| 818 | MysteryPancake/Discord-TTS · Text to speech Discord bot using FakeYou | | Emerging |
| 819 | chaiyujin/dctts-pytorch · The pytorch implementation of DC-TTS | | Emerging |
| 820 | rishikksh20/vae_tacotron2 · VAE Tacotron 2, an alternative of GST Tacotron | | Emerging |
| 821 | RapidAI/RapidASR · 📣 Commercial-grade open-source automatic speech recognition library: works out of the box, supports all platforms, and recognizes mixed Chinese-English speech. A Cross-platform implementation of ASR... | | Emerging |
| 822 | Cay-Zhang/SwiftSpeech · A speech recognition framework designed for SwiftUI. | | Emerging |
| 823 | rioharper/VocalForge · Your one-stop solution for voice dataset creation | | Emerging |
| 824 | Voine/Bert-VITS2-MNN · TTS System Bert-VITS2 Android Ver, powered by alibaba-MNN engine. | | Emerging |
| 825 | ubisoft/ubisoft-laforge-daft-exprt · Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis | | Emerging |
| 826 | zhao-kun/VibeVoiceFusion · VibeVoiceFusion is a full-stack, multi-speaker voice generation web system... | | Emerging |
| 827 | Picovoice/falcon · On-device speaker diarization powered by deep learning | | Emerging |
| 828 | benjaminwan/ChineseTtsTflite · Android Chinese TTS Engine based on TensorFlow TTS, used for TfLite models... | | Emerging |
| 829 | danthelion/doc2audiobook · Convert text documents to high fidelity audio(books). | | Emerging |
| 830 | Niger-Volta-LTI/yoruba-text · Yorùbá language training text for NLP, ASR and TTS tasks | | Emerging |
| 831 | Oknolaz/vasisualy · Vasisualy is a simple Russian-language voice assistant written in Python... | | Emerging |
| 832 | BernieTv/ElevenLabs-Clone · A self-hosted ElevenLabs clone for text-to-speech, voice conversion, and AI... | | Emerging |
| 833 | mush42/sonata-nvda · This add-on implements a speech synthesizer driver for NVDA using neural TTS... | | Emerging |
| 834 | h5p/h5p-speak-the-words · Create questions answered through speech | | Emerging |
| 835 | fewieden/MMM-voice · Offline Voice Recognition Module for MagicMirror² | | Emerging |
| 836 | NATSpeech/NATSpeech · A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official... | | Emerging |
| 837 | daniilrobnikov/vits2 · VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with... | | Emerging |
| 838 | OAID/cortex-m-kws · Cortex M KWS example with Tengine Lite. | | Emerging |
| 839 | HAKORADev/VODER · Voice Operation and Design Engine with Reproduction capabilities | | Emerging |
| 840 | by2101/OpenASR · A pytorch based end2end speech recognition system. | | Emerging |
| 841 | huggingface/distil-whisper · Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller,... | | Emerging |
| 842 | rishikksh20/Fre-GAN-pytorch · Fre-GAN: Adversarial Frequency-consistent Audio Synthesis | | Emerging |
| 843 | ccoreilly/LocalSTT · Android Speech Recognition Service using Vosk/Kaldi and Mozilla DeepSpeech | | Emerging |
| 844 | synesthesiam/rhasspy · Rhasspy voice assistant for offline home automation | | Emerging |
| 845 | baizeteam/baize-toolbox · Baize Toolbox, a powerful multimedia tool built on Electron + FFmpeg | | Emerging |
| 846 | PrzemyslawSwiderski/python-gradle-plugin · Gradle plugin to run Python projects. | | Emerging |
| 847 | chrisjp/tts · A simple tool to demo text-to-speech using various services' voices. HTML5... | | Emerging |
| 848 | siva-sub/NekoSpeak · Private, offline AI Text-to-Speech for Android with Kokoro, KittenTTS,... | | Emerging |
| 849 | ArdaGnsrn/elevenlabs-laravel · This is an Open Source PHP Laravel package for ElevenLabs Text to Speech API. | | Emerging |
| 850 | Kaljurand/Inimesed · An Android app that lets you search your contacts by voice. Internet not... | | Emerging |
| 851 | deepgram-devs/nextjs-text-to-speech · Get started using Deepgram's Text-to-Speech with this Next.js demo app | | Emerging |
| 852 | Mobile-Artificial-Intelligence/babylon · Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and... | | Emerging |
| 853 | areebbeigh/winspeech · Speech recognition and synthesis library for Windows - Python 2 and 3. | | Emerging |
| 854 | nl8590687/ASRT_SDK_WinClient · A Windows client SDK and demo software for ASRT speech recognition system.... | | Emerging |
| 855 | daanzu/deepspeech-websocket-server · Server & client for DeepSpeech using WebSockets for real-time speech... | | Emerging |
| 856 | spring-media/DeepPhonemizer · Grapheme to phoneme conversion with deep learning. | | Emerging |
| 857 | Tinkoff/voicekit-examples · Examples on how to use Tinkoff Voicekit | | Emerging |
| 858 | skit-ai/kaldi-serve · Server framework for Kaldi ASR Toolkit | | Emerging |
| 859 | juliuskunze/speechless · Speech-to-text based on wav2letter built for transfer learning | | Emerging |
| 860 | RaduBolbo/F5-TTS-Emotional-CFG · Zero-shot voice cloning text-to-speech (TTS) with explicit emotion class... | | Emerging |
| 861 | trldvix/youtube-transcript-api · Java library which allows you to retrieve subtitles/transcripts for a single... | | Emerging |
| 862 | Amirrezahmi/SelfTalker · Engage in conversation with your virtual self using AI techniques like NLP,... | | Emerging |
| 863 | GetcharZp/go-speech · go-speech is a lightweight speech library built on Golang + ONNX, supporting TTS (text-to-speech) and ASR (speech-to-text). Already integrated... | | Emerging |
| 864 | mlalma/MisakiSwift · Swift port of Misaki G2P (grapheme-to-phoneme) library that can be used e.g.... | | Emerging |
| 865 | createcandle/voco · Privacy friendly voice control for the Candle Controller / WebThings... | | Emerging |
| 866 | IBM/speech-to-text-code-pattern · WARNING: This repository is no longer maintained | | Emerging |
| 867 | rhasspy/rhasspy · Offline private voice assistant for many human languages | | Emerging |
| 868 | yukukotani/pi-voice · Headless voice interface for the Pi Coding Agent | | Emerging |
| 869 | vineeths96/Spoken-Keyword-Spotting · In this repository, we explore using a hybrid system consisting of a... | | Emerging |
| 870 | maum-ai/univnet · Unofficial PyTorch Implementation of UnivNet Vocoder... | | Emerging |
| 871 | theblackcat102/edgedict · Working online speech recognition based on RNN Transducer. (Trained model... | | Emerging |
| 872 | nnsvs/nnsvs · Neural network-based singing voice synthesis library for research | | Emerging |
| 873 | Camb-ai/MARS5-TTS · MARS5 speech model (TTS) from CAMB.AI | | Emerging |
| 874 | gabriele-mastrapasqua/qwen3-tts · Pure C inference engine for Qwen3-TTS text-to-speech. No Python, no PyTorch... | | Emerging |
| 875 | PraaneshSelvaraj/speech_engine · Speech Engine is a Python package that provides a simple interface for... | | Emerging |
| 876 | WindQAQ/listen-attend-and-spell · Tensorflow implementation of "Listen, Attend and Spell" authored by William... | | Emerging |
| 877 | livingingroups/animal2vec · animal2vec: A self-supervised transformer for rare-event raw audio input | | Emerging |
| 878 | yaph/tts-samples · This repository provides text-to-speech (TTS) audio samples in MP3 format... | | Emerging |
| 879 | gsssrao/UnityAndroidSpeechRecognition · This repository is a Unity plugin for Android Speech Recognition (based on... | | Emerging |
| 880 | rhasspy/piper · A fast, local neural text to speech system | | Emerging |
| 881 | Kaljurand/speechutils · Android library for speech-to-text and text-to-speech apps | | Emerging |
| 882 | see2023/Bert-VITS2-ext · Facial expression and animation testing based on Bert-VITS2. | | Emerging |
| 883 | AlexandaJerry/whisper-vits-japanese · Vits Japanese with Whisper as data processor (you can train your VITS even... | | Emerging |
| 884 | georgesterpu/avsr-tf1 · Audio-Visual Speech Recognition using Sequence to Sequence Models | | Emerging |
| 885 | shhossain/BanglaTTS · BanglaTTS is a text-to-speech (TTS) system for Bangla language that works in... | | Emerging |
| 886 | szimek/webrtc-translate · Highly experimental (read: "barely working") app that uses WebRTC API and... | | Emerging |
| 887 | hash2430/pitchtron · TTS for pitch-accented language. Korean dialect DB. | | Emerging |
| 888 | sanchit-gandhi/whisper-jax · JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU. | | Emerging |
| 889 | Onuronon-lab/Shrutik · Open-source voice data collection platform for building inclusive voice... | | Emerging |
| 890 | i4Ds/whisper-finetune · This repository contains code for fine-tuning the Whisper speech-to-text model. | | Emerging |
| 891 | mozhou-tech/kim-voice-assistant · Kim, your personal voice kit for Home Intelligence. | | Emerging |
| 892 | algolia/voice-overlay-android · 🗣 An overlay that gets your user’s voice permission and input as text in a... | | Emerging |
| 893 | Candida18/Virtual-Assistance-For-The-Blind · The proposed Voice-based Email System uses AI (voice commands) that will... | | Emerging |
| 894 | r9y9/ttslearn · ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python) | | Emerging |
| 895 | inclusionAI/Ming-UniAudio · Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing... | | Emerging |
| 896 | resemble-ai/resemble-alexa · This is sample code for an Alexa skill that uses realistic voice cloning... | | Emerging |
| 897 | maum-ai/assem-vc · Official Code for Assem-VC @ICASSP2022 | | Emerging |
| 898 | CheshireCC/faster-whisper-GUI · faster_whisper GUI with PySide6 | | Emerging |
| 899 | markomijic/TTS-Mod-Vault · Cross-platform Tabletop Simulator mod backup & download tool — the modern... | | Emerging |
| 900 | apluka34/Bud500 · Bud500: A Comprehensive Vietnamese ASR Dataset | | Emerging |