All Voice AI Tools

6,981 tools ranked by quality score · Page 18 of 70

Showing 1701–1800 of 6,981
# Tool Score Tier
1701 gooofy/py-marytts

Python MaryTTS HTTP client library

36
Emerging
1702 andresayac/edge-tts-php

Edge TTS is a PHP package that allows access to the online text-to-speech...

36
Emerging
1703 Ezdokz1337/sunona-v0.001

🎤 Build and deploy intelligent voice AI agents in minutes with Sunona, your...

36
Emerging
1704 rishikksh20/LightSpeech

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

36
Emerging
1705 mu-hashmi/personaplex-mlx

PersonaPlex on Apple Silicon: an MLX port of NVIDIA’s full-duplex...

36
Emerging
1706 Youdef20/voxtral.c

🔊 Streamline audio processing with Voxtral.c, a pure C implementation for...

36
Emerging
1707 mobilepadawan/Speakit-JS

Elevate your web applications with the power of JavaScript speech synthesis.

36
Emerging
1708 instavar/qwen3-tts-lora-finetuning

Qwen3‑TTS LoRA fine‑tuning tools (companion repo) for custom voice adaptation

36
Emerging
1709 ZeroneBit/Edge-TTS-Net

Use Microsoft Edge's online text-to-speech service from .NET WITHOUT needing...

36
Emerging
1710 agentvoiceresponse/avr-tts-elevenlabs

This repository demonstrates the integration between Agent Voice Response...

36
Emerging
1711 cxyfer/GeminiASR

A Python tool that uses Google Gemini API to transcribe video or audio files...

36
Emerging
1712 Hritikraj8804/Autotube

🤖 Automated YouTube Shorts creation using n8n, AI script generation, and...

36
Emerging
1713 niker/EdgeTtsSharp

EdgeTTS Sharp is a library that provides an easy-to-use, realtime-streaming,...

36
Emerging
1714 sdip15fa/safecantonese.ai.app

Free, open-source, offline, safe and secure AI Cantonese transcription, in...

36
Emerging
1715 mgonzs13/tts_ros

Text-to-Speech for ROS 2

36
Emerging
1716 bakaburg1/minutemaker

Generate meeting minutes starting from an audio recording or a transcript...

36
Emerging
1717 lefinepro/cogni

Workflow automation as simple as taking notes

36
Emerging
1718 junjie-xyz/whisper-video

Generate subtitles for all the videos in a folder with OpenAI's Whisper...

35
Emerging
1719 emiliioaguirre/youtube-live-tts

Real-time YouTube Live Chat Text-to-Speech (TTS) using ElevenLabs AI voices

35
Emerging
1720 superU-ai/voice-agent-QA

A unified benchmarking framework for evaluating Voice AI agents across...

35
Emerging
1721 kamiazya/ngx-speech-recognition

Angular 5+ speech recognition service (based on browser implementation such...

35
Emerging
1722 twilio-labs/sample-autopilot-voice-ivr

Voice-Powered IVR Chatbot with Autopilot

35
Emerging
1723 GravityPoet/ChordVox

Your voice is the fastest keyboard. Local AI voice input — speak, AI polish,...

35
Emerging
1724 Gauff/EpubToAudioBookConverter

Convert EPUB files to MP3 audio books with ease using this intuitive and...

35
Emerging
1725 sljavi/handsfree-for-web-control-speech-recognition-module

Handsfree for Web module for asking it to start or stop listening for voice commands

35
Emerging
1726 Hamtech-ai/wav2vec2-fa

Fine-tune Wav2vec2, an ASR model released by Facebook

35
Emerging
1727 SameeraMurthy/sanskrit-tts

Generate Text-to-Speech for Sanskrit

35
Emerging
1728 rhasspy/piper-samples

Samples for Piper text to speech system

35
Emerging
1729 ziligy/watson-text-talker

Simple python Text-to-Speech Interface using IBM's Watson TTS

35
Emerging
1730 LEMAS-Project/LEMAS-TTS

LEMAS‑TTS is a multilingual zero‑shot text‑to‑speech system, supporting 10...

35
Emerging
1731 HarunoriKawano/BEST-RQ

Implementation of the paper "Self-supervised Learning with Random-projection...

35
Emerging
1732 a-n-rose/Python-Sound-Tool

SoundPy (alpha stage) is a research-based python package for speech and...

35
Emerging
1733 pncnmnp/phoenix10.1

Creates personalized radio stations with your own radio jockey!

35
Emerging
1734 billiax/voxglide

Embeddable voice AI SDK for web pages — speak to fill forms, click buttons,...

35
Emerging
1735 maketheproduct/flowstay

Flowstay is a macOS app that allows instant transcription across all your...

35
Emerging
1736 saurabhchalke/whisper-meta-quest

Running speech-to-text in a Meta Quest headset using OpenAI's Whisper tiny model

35
Emerging
1737 opencog/TinyCog

Small Robot, Toy Robot platform

35
Emerging
1738 vectominist/spin

Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for...

35
Emerging
1739 phyce/Narration-Studio

Narration Studio, your all in one TTS Solution!

35
Emerging
1740 gachi0/konishiTTS

A Discord read-aloud bot using VOICEVOX

35
Emerging
1741 AkojimaSLP/Frame-by-frame-closed-form-update-for-mask-based-adaptive-MVDR-beamforming

speech-enhancement

35
Emerging
1742 DivineUX23/Audio-to-Audio-translation

Imagine translating your speech or anybody's speech to any language you want...

35
Emerging
1743 StanGirard/speechdigest

Audio to summary with openAI Whisper & GPT 3.5/4 using streamlit

35
Emerging
1744 huckiyang/QuantumSpeech-QCNN

IEEE ICASSP 21 - Quantum Convolution Neural Networks for Speech Processing...

35
Emerging
1745 Ikaros-521/FunASR_WS

A WebSocket server modified from the official FunASR demo, paired with FastAPI to provide an HTTP service; supports real-time ASR testing in the browser

35
Emerging
1746 IS2AI/ISSAI_SAIDA_Kazakh_ASR

the first industrial-scale open-source Kazakh speech corpus. KSC2 corpus...

35
Emerging
1747 mathigatti/RealTimeSingingSynthesizer

Live Coding Singing Synthesizer. Python sinsy-NG wrapper.

35
Emerging
1748 artcore-c/AI-Voice-Clone-with-Qwen3-TTS

Free voice cloning and TTS for creators using Qwen3-TTS on Google Colab....

35
Emerging
1749 DarmorGamz/Youtube-Shorts-Generator

Harness OpenAI's power to effortlessly create YouTube Shorts with this...

35
Emerging
1750 aboda-dirbas/whisperclip

🎤 Enhance your voice-to-text transcriptions with WhisperClip, prioritizing...

35
Emerging
1751 indonesian-nlp/multilingual-asr

Multilingual Speech Recognition for Indonesian Languages

35
Emerging
1752 iBrammm/qwen-asr

🎙️ Implement fast, dependency-free C inference for Qwen3-ASR speech-to-text...

35
Emerging
1753 tempo-riz/deepgram_speech_to_text

A Deepgram client for Dart and Flutter, supporting all Speech-to-Text and...

35
Emerging
1754 samsad35/source-filter-vae

[SpeechCom Journal] Learning and controlling the source-filter...

35
Emerging
1755 tuanio/noisy-student-training-asr

Pytorch implementation of Noisy Student Training for Automatic Speech...

35
Emerging
1756 sebastienrousseau/akande

An innovative, open-source voice assistant powered by OpenAI's GPT-3,...

35
Emerging
1757 placebokkk/e6870

assignments for e6870 ASR class

35
Emerging
1758 GoogleCloudPlatform/text-to-speech-epg-demo

This repository contains a reference implementation demonstrating how the...

35
Emerging
1759 WelkinYang/Learn2Sing2.0

Diffusion and Mutual Information-Based Target Speaker SVS by Learning from...

35
Emerging
1760 EvilFreelancer/docker-whisper-server

whisper.cpp HTTP transcription server with OpenAI-like API in Docker

35
Emerging
1761 HaoQChen/iflytek_awaken_asr

Use iFLYTEK's technology to implement wake-up and command recognition

35
Emerging
1762 efeslab/LiteASR

[EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with...

35
Emerging
1763 oleksandr-g-rock/speech2text

speech2text

35
Emerging
1764 yanorei32/winrt-tts-server

A simple Web Based Windows Runtime (WinRT) Speech Synthesis API

35
Emerging
1765 markhliu/mpt

Code repository for the book Make Python Talk

35
Emerging
1766 coqui-ai/stt-model-manager

Coqui STT Model Manager - install, manage and try out Coqui STT models from...

35
Emerging
1767 uetuluk/xcodec2-infer-lib

CPU support for xcodec2

35
Emerging
1768 BonifacioCalindoro/whatsapp-AI-assistant

AI assistant that reads you WhatsApp conversations and audio messages, and...

35
Emerging
1769 jx1100370217/DFCNN-master

A speech recognition system based on a fully convolutional neural network

35
Emerging
1770 kwea123/Unity_live_caption

Use Google Speech-to-Text API to do real-time live stream caption on Unity!...

35
Emerging
1771 innovatorved/whisper-openai-gradio-implementation

Gradio web UI implementation of Whisper, an automatic speech recognition (ASR) system

35
Emerging
1772 ORI-Muchim/One-Click-VITS-Training

VITS(Data Preprocessing + Whisper ASR + Text Preprocessing + Modification...

35
Emerging
1773 nexxeln/spotify-voice-control

Voice control for Spotify through the terminal

35
Emerging
1774 Jobix-Ai/Iso-Vox

STT 90% Solved — Isolate specific speakers from multi-speaker "cocktail...

35
Emerging
1775 ErcinDedeoglu/WhisperDock

Dockerized Whisper C++ speech-to-text API for easy deployment and rapid...

35
Emerging
1776 tjunttila/pdf2video

A tool for making videos from PDF presentations.

35
Emerging
1777 andriyadi/Maix-SpeechRecognizer

Speech Recognition or Wake Word detection demo, developed using Maixduino...

35
Emerging
1778 MingLunHan/CIF-PyTorch

[ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech...

35
Emerging
1779 LitoMore/mac-say

The macOS built-in `say` interface for JavaScript

35
Emerging
1780 Johnson145/voxtral_wyoming

Offline Speech-to-Text (STT) service using Mistral's Voxtral model with...

35
Emerging
1781 seven-io/js-client

Official JavaScript API Client for seven.io

35
Emerging
1782 hddevteam/speechify

🎧 Text-to-speech VS Code extension with 200+ Azure voices, TypeScript...

35
Emerging
1783 Vaibhavs10/ml-with-audio

HF's ML for Audio study group

35
Emerging
1784 rishikksh20/AudioMAE-pytorch

Unofficial PyTorch implementation of Masked Autoencoders that Listen

35
Emerging
1785 rollingstarky/Python-Voice-Assistant

A Python based Voice Assistant like Siri

35
Emerging
1786 WilleIshere/SimplerKokoro

A Python package that makes it easy to use the Kokoro voice synthesis library.

35
Emerging
1787 Lightning-Universe/Echo

Production-ready audio and video transcription app that can run on your...

35
Emerging
1788 renaudjenny/TellTime

iOS application to tell the time in the British way 🇬🇧⏰

35
Emerging
1789 LlmKira/VitsServer

🌻 VITS ONNX TTS server designed for fast inference 🔥

35
Emerging
1790 fcakyon/pywhisper

openai/whisper + extra features

35
Emerging
1791 Supremesujay/murf-voice-agent-starter

🎤 Build a low-latency voice agent with real-time TTS and STT, powered by...

35
Emerging
1792 yuanshanhua/video-dubbing

An AI-powered tool for end-to-end video dubbing.

35
Emerging
1793 kauazin394/vibevoice.swift

🎤 Create low-latency text-to-speech on macOS with VibeVoice.swift,...

35
Emerging
1794 kamilc/speech-recognition

Companion repository for the blog article:...

35
Emerging
1795 moulish-dev/vita

Plug-and-play TTS integration toolkit powered by Kokoro-82M. Python + CLI...

35
Emerging
1796 maxpatiiuk/text-hoarder

A browser extension for Google Chrome. Provides reader view, saving articles...

35
Emerging
1797 IOriens/whisper-video

Generate subtitles for all the videos in a folder with OpenAI's Whisper...

35
Emerging
1798 ikfly/java-tts

java-tts: text-to-speech

35
Emerging
1799 forfrt/SteerMoE

SteerMoE: Efficient Audio-Language Models with Preserved Reasoning Capabilities

35
Emerging
1800 litagin02/vits-japros-webui

Gradio WebUI for training and speech synthesis with Japanese TTS (VITS)

35
Emerging