All Voice AI Tools
6,981 tools ranked by quality score · Page 11 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 1001 | R1ckShi/AESRC2020 · [ICASSP2021] Data preparation scripts, training pipeline and baseline... | | Emerging |
| 1002 | chrisurf/obsidian-voice · 🔊 The Obsidian Voice plugin lets you listen to your written content being... | | Emerging |
| 1003 | mitchib1440/SpeakThat · The world's most comprehensive notification reader for Android devices. | | Emerging |
| 1004 | thetobysiu/Deepstory · Deepstory turns a text/generated text into a video where the character is... | | Emerging |
| 1005 | dbklim/Voice_ChatBot · Chatbot in Russian with speech recognition using PocketSphinx and speech... | | Emerging |
| 1006 | Appen/UHV-OTS-Speech · A data annotation pipeline to generate high-quality, large-scale speech... | | Emerging |
| 1007 | gunarakulangunaretnam/real-time-language-translator · A voice recognition-based tool for translating languages in real-time. | | Emerging |
| 1008 | rishikksh20/melgan · MelGAN implementation with Multi-Band and Full Band supports... | | Emerging |
| 1009 | xyqfer/reader · Graduation project: a smartphone-based newspaper reader | | Emerging |
| 1010 | ai-learning-tools/viva-translate · Real-time translation copilot for your browser | | Emerging |
| 1011 | candlewill/Speech-Corpus-Collection · A Collection of Speech Corpus for ASR and TTS | | Emerging |
| 1012 | ttop32/coqui_tts_korea · Korean TTS using coqui TTS (glowtts and multiband melgan) | | Emerging |
| 1013 | jianchang512/zh_recogn · Recognizes Chinese speech in audio or video and exports it as SRT subtitles, based on the ModelScope community's Paraformer model | | Emerging |
| 1014 | stefantaubert/mel-cepstral-distance · A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral... | | Emerging |
| 1015 | AIFSH/ComfyUI-XTTS · a custom comfyui node for coqui-ai/TTS's xtts module! support 17 languages... | | Emerging |
| 1016 | tomasz-oponowicz/spoken_language_identification · Identify a spoken language using artificial intelligence (LID). | | Emerging |
| 1017 | sp-nitech/DNN-HSMM · pytorch implementation of DNN-HSMM for TTS | | Emerging |
| 1018 | EricBatlle/UnityAndroidSpeechRecognizer · 🗣️ Speech recognition on Unity and Android without the annoying google popup! | | Emerging |
| 1019 | taresh18/TTSizer · 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific... | | Emerging |
| 1020 | zceng/LVCNet · LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation | | Emerging |
| 1021 | keonlee9420/WaveGrad2 · PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement... | | Emerging |
| 1022 | chenmingxiang110/Chinese-automatic-speech-recognition · Chinese speech recognition | | Emerging |
| 1023 | kurianbenoy/Indic-Subtitler · Open source subtitling platform 💻 for transcribing and translating... | | Emerging |
| 1024 | teamsudocode/dexter · Let your talking do the code | | Emerging |
| 1025 | garvys-org/rustfst · Rust re-implementation of OpenFST - library for constructing, combining,... | | Emerging |
| 1026 | asticode/go-astideepspeech · Golang bindings for Mozilla's DeepSpeech speech-to-text library | | Emerging |
| 1027 | journey-ad/CosyVoice2-Ex · CosyVoice2 feature extensions (pretrained-voice inference / 3-second rapid cloning / natural-language control / automatic recognition / voice model saving / API) | | Emerging |
| 1028 | chandran-jr/Noteify · 🔎A Currency Detection app for the visually impaired which automatically... | | Emerging |
| 1029 | keonlee9420/Cross-Speaker-Emotion-Transfer · PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based... | | Emerging |
| 1030 | cboard-org/cboard-api · Cboard API provides backend functionality and persistence to the Cboard application | | Emerging |
| 1031 | xxbb1234021/speech_recognition · Chinese speech recognition | | Emerging |
| 1032 | JosefAlbers/WTM · Blazing fast whisper turbo for ASR (speech-to-text) tasks | | Emerging |
| 1033 | p0p4k/pflowtts_pytorch · Unofficial implementation of NVIDIA P-Flow TTS paper | | Emerging |
| 1034 | devnen/Kitten-TTS-Server · Self-host the ultra-lightweight Kitten TTS model with this enhanced API... | | Emerging |
| 1035 | Sgvkamalakar/Azure-Talking-Avatar · Explore the power of Azure Text-to-Speech with interactive talking avatar,... | | Emerging |
| 1036 | metame-ai/awesome-audio-plaza · Daily tracking of awesome audio papers, including music generation,... | | Emerging |
| 1037 | developers-cosmos/Mimasa · Real time multilingual face translator | | Emerging |
| 1038 | hi-paris/Prosody-Control-French-TTS · An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control | | Emerging |
| 1039 | AASHISHAG/deepspeech-german · Automatic Speech Recognition (ASR) - German | | Emerging |
| 1040 | alex-vt/WhisperInput · Offline voice input panel & keyboard with punctuation for Android. | | Emerging |
| 1041 | agan-j/xiaoniu · Xiaoniu Video Translation: an AI... that supports local video translation, subtitle translation, and YouTube video translation and download | | Emerging |
| 1042 | rhulha/StreamingKokoroJS · Unlimited text-to-speech in the Browser using Kokoro-JS, 100% local, 100%... | | Emerging |
| 1043 | Rongjiehuang/GenerSpeech · PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model... | | Emerging |
| 1044 | keonlee9420/PortaSpeech · PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative... | | Emerging |
| 1045 | NickZaitsev/ru-normalizr · ru-normalizr is the best open-source normalizer for Russian text. Converts... | | Emerging |
| 1046 | fedden/RenderMan · Command line C++ and Python VSTi Host library with MFCC, FFT, RMS and audio... | | Emerging |
| 1047 | tema6120/ForgetMeNot · A flashcard app for Android. | | Emerging |
| 1048 | ponlponl123/-Prototype-AIVTuber · an open-source Artificial Intelligence Virtual Youtuber (AI VTuber), (this... | | Emerging |
| 1049 | novoic/surfboard · Novoic's audio feature extraction library | | Emerging |
| 1050 | ryanleary/patter · speech-to-text in pytorch | | Emerging |
| 1051 | timmo001/home-assistant-assist-desktop · Use Home Assistant Assist on the desktop. Compatible with Windows, MacOS, and Linux | | Emerging |
| 1052 | mush42/sonata · A cross-platform inference engine for neural TTS models. | | Emerging |
| 1053 | huakunyang/SummerTTS · SummerTTS... | | Emerging |
| 1054 | voicetestdev/voicetest · Test harness for voice agents. Import from Retell, VAPI, Bland, LiveKit. Run... | | Emerging |
| 1055 | keonlee9420/VAENAR-TTS · PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based... | | Emerging |
| 1056 | algolia/voice-overlay-ios · 🗣 An overlay that gets your user’s voice permission and input as text in a... | | Emerging |
| 1057 | linux-speakup/espeakup · a lightweight connector for espeak-ng and speakup | | Emerging |
| 1058 | huawei-noah/Speech-Backbones · This is the main repository of open-sourced speech technology by Huawei... | | Emerging |
| 1059 | wangkaisine/mrcp-plugin-with-freeswitch · Uses FreeSWITCH to accept users' mobile phone calls, via UniMRCP... | | Emerging |
| 1060 | qianchang/zici · Zici (characters and words): a collection of resources on Chinese classics and Chinese characters, words, and pinyin | | Emerging |
| 1061 | Jakobovski/free-spoken-digit-dataset · A free audio dataset of spoken digits. An audio version of MNIST. | | Emerging |
| 1062 | superstarryeyes/lue · Terminal eBook Reader with Audiobook-Quality Text-to-Speech — Supports EPUB,... | | Emerging |
| 1063 | Baidu-AIP/speech-demo · Speech API examples | | Emerging |
| 1064 | karim23657/Persian-tts-coqui · Persian/Farsi text to speech(TTS) training using coqui tts | | Emerging |
| 1065 | kaituoxu/Speech-Transformer · A PyTorch implementation of Speech Transformer, an End-to-End ASR with... | | Emerging |
| 1066 | hujingshuang/MTrans · Multi-source Translation | | Emerging |
| 1067 | mozilla/DeepSpeech-examples · Examples of how to use or integrate DeepSpeech | | Emerging |
| 1068 | jtkim-kaist/VAD · Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM... | | Emerging |
| 1069 | bjoernkarmann/project_alias · Alias is a teachable “parasite” that is designed to give users more control... | | Emerging |
| 1070 | OpenCOVID19CoughCheck/CoughCheckApp · Development of AI audio app to compare the cough of a Coronavirus (COVID-19)... | | Emerging |
| 1071 | wangz-code/legado-tts · Legado book reader app with built-in EdgeTTS read-aloud; listen to books with no extra deployment, install and listen; the speech engine uses rany2/edge-tts... | | Emerging |
| 1072 | caizexin/tf_multispeakerTTS_fc · the Tensorflow version of multi-speaker TTS training with feedback constraint | | Emerging |
| 1073 | nyrahealth/CrisperWhisper · Verbatim Automatic Speech Recognition with improved word-level timestamps... | | Emerging |
| 1074 | abdozmantar/ComfyUI-DeepExtractV2 · DeepExtractV2 – lightning-fast, high-quality audio separator. Instantly... | | Emerging |
| 1075 | nobody132/masr · Chinese speech recognition; Mandarin Automatic Speech Recognition; | | Emerging |
| 1076 | alexpinel/Dot · Text-To-Speech, RAG, and LLMs. All local! | | Emerging |
| 1077 | LiberSonora/LiberSonora · LiberSonora, meaning "free voice", is a powerful, AI-enabled, open-source audiobook toolkit with intelligent subtitle extraction, AI title generation, multilingual translation, and more; supports... | | Emerging |
| 1078 | hcy71o/AutoVocoder · Autovocoder: Fast Waveform Generation from a Learned Speech Representation... | | Emerging |
| 1079 | primaryobjects/voice-gender · Gender recognition by voice and speech analysis | | Emerging |
| 1080 | BogiHsu/WG-WaveNet · Real-Time High-Fidelity Speech Synthesis without GPU | | Emerging |
| 1081 | MattePalte/Verbify-TTS · Simple and free Text-to-Speech (TTS) engine that reads for you any text on... | | Emerging |
| 1082 | shekit/alexa-sign-language-translator · A project to make Amazon Echo respond to sign language using your webcam | | Emerging |
| 1083 | Rongjiehuang/Multi-Singer · PyTorch Implementation of Multi-Singer (ACM-MM'21) | | Emerging |
| 1084 | PyThaiNLP/tts-thai · Thai TTS | | Emerging |
| 1085 | lukeewin/FunASR_API · A FunASR-based speech recognition API with speaker diarization \| This is a speaker-diarization-based speech... | | Emerging |
| 1086 | Navatusein/Silero-TTS-Service · Silero TTS backend service. Can be used with Home Assistant and Rhasspy. | | Emerging |
| 1087 | keonlee9420/StyleSpeech · PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive... | | Emerging |
| 1088 | artibex/piper-http · Creates a docker image that runs the piper http service | | Emerging |
| 1089 | botany-labs/voice-ai-js-starter · Starter project for building real-time AI Voice Assistants | | Emerging |
| 1090 | repodiac/german_transliterate · Python module to clean and transliterate (i.e. normalize) German text... | | Emerging |
| 1091 | mravanelli/pytorch-kaldi · pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid... | | Emerging |
| 1092 | Saganaki22/ComfyUI-Maya1_TTS · A ComfyUI node for Maya1, a 3B-parameter speech model built for expressive... | | Emerging |
| 1093 | googlecreativelab/obvi · A Polymer 3+ webcomponent / button for doing speech recognition | | Emerging |
| 1094 | trungnguyen21/AutomatedYoutubeShorts · Automatically Generate video based on given content! | | Emerging |
| 1095 | keonlee9420/FastPitchFormant · PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based... | | Emerging |
| 1096 | neosapience/editts · Official implementation of EdiTTS: Score-based Editing for Controllable... | | Emerging |
| 1097 | sberdevices/assistant-client · A tool for testing and debugging CanvasApps, skills of the family of... | | Emerging |
| 1098 | googlecreativelab/morse-speak-demo · Text-to-Speech (TTS) demo web app that converts written text into spoken... | | Emerging |
| 1099 | declare-lab/jamify · JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and... | | Emerging |
| 1100 | michaelzhang-ai/Text2Video · ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with... | | Emerging |