All Voice AI Tools
6,981 tools ranked by quality score · Page 10 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 901 | m3hrdadfi/soxan<br>Wav2Vec for speech recognition, classification, and audio classification | | Emerging |
| 902 | zslrmhb/Omniverse-Virtual-Assisstant<br>Audio2Face Avatar with Riva SDK functionality | | Emerging |
| 903 | symblai/getting-started-samples<br>Code samples to get started quickly with Symbl's Voice SDK and APIs:... | | Emerging |
| 904 | ycyy/faster-whisper-webui<br>A Gradio web UI for faster-whisper | | Emerging |
| 905 | thewh1teagle/piper-rs<br>Use Piper TTS models in Rust | | Emerging |
| 906 | Berkeley-Speech-Group/sylber<br>Sylber: Syllabic Embedding Representation of Speech from Raw Audio | | Emerging |
| 907 | earlephilhower/BackgroundAudio<br>Arduino library for easy, interrupt-driven speech, MP3, AAC, and WAV... | | Emerging |
| 908 | zw76859420/ASR_WORD<br>Builds an acoustic model with an end-to-end approach, using characters as the modeling unit and a DCNN-CTC network architecture. | | Emerging |
| 909 | yrom/finetune-index-tts<br>IndexTTS fine-tuning notebooks | | Emerging |
| 910 | HeyHeyChicken/NOVA-NodeJS<br>NOVA is a customizable voice assistant made with Node.js. | | Emerging |
| 911 | OpenMOSS/MOSS-Speech<br>MOSS-Speech is a true speech-to-speech large language model without text guidance. | | Emerging |
| 912 | huschen/kaggle_speech_recognition<br>Conv-LSTM-CTC speech recognition network (end-to-end), written in TensorFlow. | | Emerging |
| 913 | coqui-ai/TTS-papers<br>🐸 collection of TTS papers | | Emerging |
| 914 | tonesto7/echo-speaks<br>Integrate your Amazon Echo devices into your Hubitat environment to create... | | Emerging |
| 915 | nixonyh/UnityTTS<br>Text to Speech in Unity. | | Emerging |
| 916 | yerfor/SyntaSpeech<br>SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022;... | | Emerging |
| 917 | syntithenai/hermod<br>Voice services stack from audio hardware through hotword, ASR, NLU, AI... | | Emerging |
| 918 | zycv/awesome-keyword-spotting<br>This repository is a curated list of awesome Speech Keyword Spotting... | | Emerging |
| 919 | just-ai/aimybox-android-sdk<br>Voice assistant SDK for Android | | Emerging |
| 920 | FelixWaweru/elevenlabs-node<br>Eleven Labs text to speech package for NodeJS. You can use the official... | | Emerging |
| 921 | awsaf49/audio_classification_models<br>TensorFlow Audio Classification Models | | Emerging |
| 922 | Justmalhar/open-audio<br>Open-Audio TTS: A robust web app leveraging OpenAI's powerful Text-to-Speech... | | Emerging |
| 923 | rishikksh20/TFGAN<br>TFGAN: Time and Frequency Domain Based Generative Adversarial Network for... | | Emerging |
| 924 | coqui-ai/TTS-recipes<br>🐸TTS recipes for different datasets | | Emerging |
| 925 | aofdev/vue-pwa-speech<br>Vue2 synchronous speech recognition (speech to text) with Google Cloud... | | Emerging |
| 926 | Rubiksman78/MonikA.I<br>Submod for MAS with AI-based features | | Emerging |
| 927 | keonlee9420/Comprehensive-Transformer-TTS<br>A Non-Autoregressive Transformer based Text-to-Speech, supporting a family... | | Emerging |
| 928 | persephone-tools/persephone<br>A tool for automatic phoneme transcription | | Emerging |
| 929 | momysnow/Momy-Desk-Robot<br>Smart desktop robot. | | Emerging |
| 930 | am-sokolov/videodubber<br>A program for automatically dubbing any video file into many languages. | | Emerging |
| 931 | oddlama/whisper-overlay<br>A Wayland overlay providing speech-to-text functionality for any application... | | Emerging |
| 932 | aofdev/vue-speech-streaming<br>Vue2 streaming speech recognition (speech to text) with Google Cloud Speech | | Emerging |
| 933 | ttaoREtw/Tacotron-pytorch<br>A PyTorch implementation of Tacotron: End-to-end Text-to-speech Deep-Learning Model | | Emerging |
| 934 | shenasa-ai/speech2text<br>A Deep-Learning-Based Persian Speech Recognition System | | Emerging |
| 935 | apaar97/translate<br>Android app to translate text conversations, supporting 90+ languages with... | | Emerging |
| 936 | Aratako/MioTTS-Inference<br>Inference server for MioTTS, a lightweight and fast LLM-based TTS model. | | Emerging |
| 937 | nl8590687/ASRT_SDK_Java<br>ASRT Speech Recognition SDK for Java. | | Emerging |
| 938 | murf-ai/murf-python-sdk<br>Python SDK for the Murf text-to-speech API | | Emerging |
| 939 | jpescada/TwitterPiBot<br>A Python-based bot for Raspberry Pi that grabs tweets with a specific... | | Emerging |
| 940 | snakers4/open_stt<br>Open STT | | Emerging |
| 941 | LSimon95/megatts2<br>Unofficial implementation of Megatts2 | | Emerging |
| 942 | jim-schwoebel/download_audioset<br>📁 This repo makes it easy to download the raw audio files from AudioSet... | | Emerging |
| 943 | jaco-bro/diajax<br>Dia-JAX: A JAX port of Dia, the text-to-speech model for generating... | | Emerging |
| 944 | jishengpeng/WavTokenizer<br>[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second... | | Emerging |
| 945 | dunky11/voicesmith<br>[WIP] VoiceSmith makes training text to speech models easy. | | Emerging |
| 946 | subho406/TF-Speech-Recognition-Challenge-Solution<br>Source code of the model used in TensorFlow Speech Recognition Challenge... | | Emerging |
| 947 | drmfinlay/pyjsgf<br>JSpeech Grammar Format (JSGF) compiler, matcher and parser package for Python. | | Emerging |
| 948 | sljeff/anycast<br>An AI-Powered Podcast App. | | Emerging |
| 949 | hirofumi0810/asr_preprocessing<br>Python implementation of pre-processing for End-to-End speech recognition | | Emerging |
| 950 | cuinjune/text2video<br>A software tool that converts text to video for a more engaging learning experience | | Emerging |
| 951 | mozilla-ai/speech-to-text-finetune<br>Blueprint by Mozilla.ai for finetuning a Speech-To-Text model in your own language | | Emerging |
| 952 | PABannier/bark.cpp<br>Suno AI's Bark model in C/C++ for fast text-to-speech generation | | Emerging |
| 953 | gurjar1/OmniDictate<br>Free, open-source, real-time dictation for Windows. Runs locally (no... | | Emerging |
| 954 | Pikurrot/whisper-gui<br>A simple GUI to use Whisper. | | Emerging |
| 955 | igorshmukler/kokoro-ruslan<br>Kokoro Language Model Training Script for Russian (Ruslan Corpus) | | Emerging |
| 956 | rajkishorbgp/JARVIS-AI-Assistant<br>JARVIS AI Assistant 🤖 A virtual assistant project inspired by Tony Stark's... | | Emerging |
| 957 | SEPIA-Framework/sepia-stt-server<br>SEPIA server to support open-source speech recognition via WebSocket connection. | | Emerging |
| 958 | matteo-convertino/vosk-build-model<br>How to create your own model for Vosk | | Emerging |
| 959 | finchvox/finchvox<br>Voice AI Observability, Elevated | | Emerging |
| 960 | travisvn/edge-tts-extension<br>Chrome extension to generate free, high-quality text-to-speech using... | | Emerging |
| 961 | lucasnewman/f5-tts-swift<br>Implementation of F5-TTS in Swift using MLX | | Emerging |
| 962 | SynHub/syn-speech<br>Syn.Speech is a flexible speaker independent continuous speech recognition... | | Emerging |
| 963 | rse/speechflow<br>Speech Processing Flow Graph | | Emerging |
| 964 | Amirrezahmi/Zozo-Assistant<br>Zozo Assistant is a voice-activated chatbot that performs tasks based on... | | Emerging |
| 965 | Azure-Samples/sonic-brief<br>Sonic Brief Project is an Azure-based system that transcribes and... | | Emerging |
| 966 | neosun100/cosyvoice-docker<br>🎙️ CosyVoice All-in-One Docker - Production-ready TTS with Web UI, REST API... | | Emerging |
| 967 | djmango/obsidian-transcription<br>Obsidian plugin to create high-quality transcriptions from markdown linked... | | Emerging |
| 968 | isaiahbjork/expo-kokoro-onnx<br>Run Kokoro TTS locally on device using Expo & ONNX Runtime | | Emerging |
| 969 | upskyy/Transformer-Transducer<br>PyTorch implementation of "Transformer Transducer: A Streamable Speech... | | Emerging |
| 970 | Markfryazino/wav2lip-hq<br>Extension of Wav2Lip repository for processing high-quality videos. | | Emerging |
| 971 | bawangxx/XZVoice<br>Free and open source text-to-speech software | | Emerging |
| 972 | TimoBolkart/voca<br>This codebase demonstrates how to synthesize realistic 3D character... | | Emerging |
| 973 | Cvandia/nonebot-plugin-fishspeech-tts<br>TTS plugin for nonebot2 supporting fish-speech and fish-audio | | Emerging |
| 974 | isomoes/blivedm_rs<br>A powerful Rust library providing a WebSocket client for Bilibili live-room danmaku (bullet comments), with real-time danmaku monitoring, text-to-speech (TTS), and browser cookie... | | Emerging |
| 975 | speechmatics/speechmatics-python-sdk<br>Python SDKs for Speechmatics APIs | | Emerging |
| 976 | aman179102/podvoice<br>Local-first CLI that turns Markdown scripts into multi-speaker podcast-style... | | Emerging |
| 977 | sovaai/sova-asr<br>SOVA ASR (Automatic Speech Recognition) | | Emerging |
| 978 | antor44/livestream_video<br>playlist4whisper manages media streams playlists for livestream_video.sh,... | | Emerging |
| 979 | ImNimboss/uberduck<br>A synchronous and asynchronous API wrapper for the UberDuck text-to-speech... | | Emerging |
| 980 | lixiangyu890601/EasyAICC-Easy-AI-Call-Center<br>Outbound calling system, intelligent outbound calling, automated outbound calling system, manual outbound calling, call center | | Emerging |
| 981 | lukaszliniewicz/Pandrator<br>Turn PDFs and EPUBs into audiobooks, subtitles or videos into dubbed videos... | | Emerging |
| 982 | yl4579/StyleTTS-VC<br>Official Implementation of StyleTTS-VC | | Emerging |
| 983 | opensource-spraakherkenning-nl/Kaldi_NL<br>Code related to the Dutch instance and user groups of the KALDI speech... | | Emerging |
| 984 | Pankaj-Baranwal/pocketsphinx<br>Updated ROS bindings to pocketsphinx | | Emerging |
| 985 | harmlessman/PAFTS<br>PAFTS: a library that preprocesses audio for TTS. | | Emerging |
| 986 | lablab-ai/OpenAI_Whisper_Streamlit<br>A minimalistic, Streamlit-based automatic speech recognition web app powered... | | Emerging |
| 987 | wit-ai/android-voice-demo<br>Example of how to build a voice-enabled Android app with Wit.ai | | Emerging |
| 988 | meemalabs/laravel-text-to-speech<br>💬 A wrapper for popular TTS services to create a simpler & more uniform API.... | | Emerging |
| 989 | dmisol/flexatar-virtual-webcam<br>Personalized Virtual Webcam for WebRTC | | Emerging |
| 990 | felixchenfy/Speech-Commands-Classification-by-LSTM-PyTorch<br>Classification of 11 types of audio clips using MFCCs features and LSTM.... | | Emerging |
| 991 | aws-solutions/content-localization-on-aws<br>Automatically generate multi-language subtitles using AWS AI/ML services.... | | Emerging |
| 992 | yuhr/langue<br>A modern platform for conlanging. Currently in the planning stage. | | Emerging |
| 993 | drien/tts-joinery<br>Stitch together text-to-speech over 4096 characters via the OpenAI API | | Emerging |
| 994 | pandeydivesh15/AVSR-Deep-Speech<br>Google Summer of Code 2017 Project: Development of Speech Recognition Module... | | Emerging |
| 995 | notAI-tech/IndicASR<br>Speech Recognition for Indic languages. | | Emerging |
| 996 | jojojaeger/whisper-streamlit<br>This master's thesis project is based on OpenAI Whisper with the goal to... | | Emerging |
| 997 | Hecate2/sukasuka-vocal-dataset-builder<br>Sukasuka anime vocal dataset. 1st anime vocal dataset. Extract audio (vocal) files from... | | Emerging |
| 998 | MohammedRashad/FPGA-Speech-Recognition<br>Experimental Speech Recognition System using VHDL & MATLAB. | | Emerging |
| 999 | haguro/elevenlabs-go<br>A Go API client library for the ElevenLabs speech synthesis platform | | Emerging |
| 1000 | youmebangbang/TTS-dataset-tools<br>Automatically generates TTS dataset using audio and associated text. Make... | | Emerging |