All Voice AI Tools

6,981 tools ranked by quality score · Page 8 of 70

Showing 701–800 of 6,981
# Tool Score Tier
701 YoavRamon/awesome-kaldi

This is a list of features, scripts, blogs and resources for better using...

47
Emerging
702 modelscope/ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained...

47
Emerging
703 Jackiexiao/zhtts

A demo of a zh/Chinese text-to-speech system that runs on CPU in real time.

47
Emerging
704 Purple-Horizons/openclaw-voice

🦞 Open-source browser-based voice chat for AI assistants. Self-hosted,...

47
Emerging
705 wildminder/ComfyUI-VoxCPM

ComfyUI node for highly expressive speech and realistic zero-shot voice cloning

47
Emerging
706 AppDevGuy/OSSSpeechKit

OSSSpeechKit offers a native iOS Speech wrapper for AVFoundation and Apple's Speech.

47
Emerging
707 mdangschat/ctc-asr

End-to-end trained speech recognition system, based on RNNs and the...

47
Emerging
708 keonlee9420/Parallel-Tacotron2

PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive...

47
Emerging
709 modelscope/KAN-TTS

KAN-TTS is a speech-synthesis training framework, please try the demos we...

47
Emerging
710 GeekyWizKid/video_processing_service

Video Processing Service is an automated video processing service that...

47
Emerging
711 soobinseo/Tacotron-pytorch

PyTorch implementation of Tacotron

47
Emerging
712 PhamHuynhAnh16/Vietnamese-RVC

A voice conversion tool project for Vietnamese users

47
Emerging
713 rishikksh20/AdaSpeech

AdaSpeech: Adaptive Text to Speech for Custom Voice

47
Emerging
714 madhavmk/Noise2Noise-audio_denoising_without_clean_training_data

Source code for the paper titled "Speech Denoising without Clean Training...

47
Emerging
715 Umesh-01/Python-Assistant

Python Assistant (PA) is a voice command based assistant service written in...

47
Emerging
716 Deepest-Project/MelNet

Implementation of "MelNet: A Generative Model for Audio in the Frequency Domain"

47
Emerging
717 open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation....

47
Emerging
718 xue-fei/sherpa-onnx-unity

sherpa-onnx in unity

47
Emerging
719 chenliangrui/EasyMrcp

Welcome to EasyMrcp! EasyMrcp is written in Java and currently provides integrations with a variety of ASR and TTS services, making ASR and TTS truly simple to use....

47
Emerging
720 travisvn/obsidian-edge-tts

Free, high quality text-to-speech for your Obsidian notes, leveraging...

47
Emerging
721 alexruperez/SpeechRecognizerButton

UIButton subclass with push to talk recording, speech recognition and...

47
Emerging
722 themanyone/whisper_dictation

Private voice keyboard, AI chat, images, webcam, recordings, voice control...

47
Emerging
723 nari-labs/dia2

TTS model capable of streaming conversational audio in realtime.

47
Emerging
724 rishikksh20/VocGAN

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested...

47
Emerging
725 maxwellobi/Android-Speech-Recognition

Continuous speech recognition library for Android with options to use...

46
Emerging
726 slp-rl/aero

This repo contains the official PyTorch implementation of "Audio Super...

46
Emerging
727 Plachtaa/VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model....

46
Emerging
728 leaonline/easy-speech

🔊 Cross browser Speech Synthesis also known as Text to speech or TTS; no...

46
Emerging
729 freewym/espresso

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

46
Emerging
730 atomiechen/FunASR-Client

Really easy-to-use Python client for FunASR runtime server.

46
Emerging
731 pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with...

46
Emerging
732 jianchang512/clone-voice

A sound cloning tool with a web interface, using your voice or any sound to...

46
Emerging
733 dhruvyad/uttertype

Short code for dictation using OpenAI Whisper for transcription.

46
Emerging
734 chenkui164/FastASR

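This project implements ASR inference in C++. It has few dependencies, is easy to install, runs inference quickly, and runs smoothly even on ARM platforms such as the Raspberry Pi 4B....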

46
Emerging
735 shashikg/WhisperS2T

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting...

46
Emerging
736 jpreprocess/jbonsai

Voice synthesis library for Text-to-Speech applications (Currently HTS...

46
Emerging
737 acoti/articulate.js

A jQuery plugin that lets the browser speak to you.

46
Emerging
738 ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous...

46
Emerging
739 Evil0ctal/Fast-Powerful-Whisper-AI-Services-API

⚡ A high-performance asynchronous API for automatic speech recognition (ASR) and translation. No need to purchase Whisper...

46
Emerging
740 duncan3dc/speaker

A PHP library to convert text to speech using various web services

46
Emerging
741 Kardbord/hfapigo

Unofficial (Golang) Go bindings for the Hugging Face Inference API

46
Emerging
742 neosapience/mlp-singer

Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing...

46
Emerging
743 joelpurra/talkie

Text-to-speech browser extension button. Select text on any web page, and...

46
Emerging
744 Quantatirsk/funasr-api

Speech recognition API service powered by FunASR and Qwen-ASR, supporting 52...

46
Emerging
745 yanorei32/discord-tts

TTS Discord Bot [VOICEROID, VOICEVOX, AivisSpeech, kttsproject, WinRT, and...

46
Emerging
746 SARIT42/lipsyncr

LipSyncr is a lip reading web app based on the LipNet model that can lip...

46
Emerging
747 arihanv/Shush

Shush is an app that deploys a WhisperV3 model with Flash Attention v2 on...

46
Emerging
748 mark-rez/TikTok-Voice-TTS

Simple Python script to interact with the TikTok TTS Voices.

46
Emerging
749 eel-brah/kokorodoki

Natural-sounding Text-to-Speech App that fits anywhere. Fast, Real-Time and flexible.

46
Emerging
750 AmphionTeam/FlexiCodec

[ICLR2026] FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates

46
Emerging
751 deepgram-starters/flask-transcription

Get started using Deepgram's Pre-Recorded Transcription with this Flask demo app

46
Emerging
752 DePasqualeOrg/mlx-swift-audio

Swift tools for text to speech (TTS) and speech to text (STT) powered by MLX

46
Emerging
753 kkoutini/PaSST

Efficient Training of Audio Transformers with Patchout

46
Emerging
754 eigenpunk/ComfyUI-audio

some generative audio tools for ComfyUI

46
Emerging
755 npuichigo/waveglow

A PyTorch implementation of the WaveGlow: A Flow-based Generative Network...

46
Emerging
756 d4n3436/Fergun

A utility Discord bot written in C# using Discord.Net

46
Emerging
757 FireRedTeam/FireRedASR2S

A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc...

46
Emerging
758 pinguy/kokoro-tts-addon

Local neural TTS for Browsers: fast, expressive, and offline—runs on modest hardware.

46
Emerging
759 VideotronicMaker/LM-Studio-Voice-Conversation

Python app for LM Studio-enhanced voice conversations with local LLMs. Uses...

46
Emerging
760 eduardolat/kokoro-web

🔊 Kokoro Web: Free AI text-to-speech, online or self-hosted, OpenAI compatible!

46
Emerging
761 ai-adv-lab/deepspeech.mxnet

An MXNet implementation of Baidu's DeepSpeech architecture

46
Emerging
762 deterministic-algorithms-lab/Cross-Lingual-Voice-Cloning

Tacotron 2 - PyTorch implementation with faster-than-realtime inference...

46
Emerging
763 pritishyuvraj/Voice-Conversion-GAN

Voice Conversion using CycleGANs for Non-Parallel Data

46
Emerging
764 sipeed/Maix-Speech

Maix Speech AI lib, a fast and small speech lib running on embedded devices,...

46
Emerging
765 halfzm/v2vt

video to video translation with voice clone and lip...

46
Emerging
766 phatjkk/SpeakIt_Vietnamese_TTS

Vietnamese Text-to-Speech on Windows Project (zalo-speech)

46
Emerging
767 cvqluu/simple_diarizer

Simplified diarization pipeline using some pretrained models - audio file to...

46
Emerging
768 cpfair/quran-align

Word-accurate timestamps for Qur'anic audio.

46
Emerging
769 smeetrs/deep_avsr

A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.

46
Emerging
770 Open-Speech-EkStep/vakyansh-models

Open source speech to text models for Indic Languages

46
Emerging
771 d4n3436/GTranslate

A collection of free translation APIs (Google Translate, Bing Translator,...

46
Emerging
772 ayutaz/piper-plus

Multilingual neural TTS (6 languages: JA/EN/ZH/ES/FR/PT) with VITS...

46
Emerging
773 gtreshchev/RuntimeSpeechRecognizer

Cross-platform, real-time, offline speech recognition plugin for Unreal...

46
Emerging
774 HumeAI/hume-react-sdk

Packages for using Hume AI and React

46
Emerging
775 JoelShine/JARVIS-AI-ASSISTANT

A true Artificial Intelligent Assistant with ALICE as backend and offline...

46
Emerging
776 goxr3plus/java-google-speech-api

🙊 Speech Recognition, Text To Speech, Google Translate

46
Emerging
777 tarun7r/Vocal-Agent

Cascading voice assistant combining real-time speech recognition, AI...

46
Emerging
778 HardCodeDev777/UnityNeuroSpeech

The world’s first game framework that lets you talk to AI in real time —...

46
Emerging
779 keonlee9420/Expressive-FastSpeech2

PyTorch Implementation of Non-autoregressive Expressive (emotional,...

46
Emerging
780 joethei/obsidian-tts

Text to speech for Obsidian. Hear your notes.

46
Emerging
781 patrickenfuego/Chapterize-Audiobooks

Split a single, monolithic mp3 audiobook file into chapters using Machine...

46
Emerging
782 ide8/tacotron2

Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow

46
Emerging
783 atomicoo/FCH-TTS

A fast Text-to-Speech (TTS) model. Works well for English, Mandarin/Chinese,...

46
Emerging
784 Open-Speech-EkStep/vakyansh-wav2vec2-experimentation

Repository containing experimentation platform on how to train, infer on...

46
Emerging
785 tover0314-w/opentypeless

Talkmore with Opentypeless. Type with your voice. Anywhere. Talk -...

46
Emerging
786 themanyone/voice_typing

State-of-the-art offline (or networked) voice typing everywhere + text...

46
Emerging
787 coqui-ai/open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

46
Emerging
788 George0828Zhang/torch_cif

A fast parallel PyTorch implementation of the "CIF: Continuous...

46
Emerging
789 PaddlePaddle/Parakeet

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer...

46
Emerging
790 ishandutta2007/Awesome-Text-to-Speech

🎤 A curated list of the latest and most influential tools, models, and...

46
Emerging
791 oliverguhr/wav2vec2-live

Live speech recognition using Facebook's wav2vec 2.0 model.

46
Emerging
792 AdroitAnandAI/Indian-Accent-Speech-Recognition

Traditional ASR (Signal & Cepstral Analysis, DTW, HMM) & DNNs (Custom Models...

46
Emerging
793 yl4579/StyleTTS

Official Implementation of StyleTTS

46
Emerging
794 BolajiAyodeji/chat-with-siri

🤖 A text-to-speech chatbot built using Nextjs, OpenAI, and ElevenLabs.

46
Emerging
795 HadrienGardeur/web-speech-recommended-voices

A list of recommended voices for the Web Speech API

46
Emerging
796 undertheseanlp/automatic_speech_recognition

Vietnamese Automatic Speech Recognition

46
Emerging
797 Chris10M/Lip2Speech

A pipeline to read lips and generate speech for the read content, i.e. Lip to...

46
Emerging
798 alexram1313/text-to-speech-sample

Python3 Text to Speech Video Sample

46
Emerging
799 libdriver/ld3320

LD3320 full-featured driver library for general-purpose MCU and Linux.

46
Emerging
800 HKoon/ChatTTS-OpenVoice

Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your...

46
Emerging