All Voice AI Tools

6,981 tools ranked by quality score · Page 14 of 70

Showing 1301–1400 of 6,981
# Tool Score Tier
1301 sayksii/Aria

ARIA - AI Realtime Intelligent Audio | Universal real-time AI subtitles for Windows

40
Emerging
1302 pschatzmann/arduino-espeak-ng

eSpeak NG is an open source speech synthesizer that supports more than...

40
Emerging
1303 RapidAI/RapidTTS

A cross platform implementation of Text-to-Speech based on ONNXRuntime.

40
Emerging
1304 yeyupiaoling/VITS-Pytorch

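A PyTorch-based speech synthesis project using VITS, a speech synthesis method. This end-to-end model is very simple to use and needs no overly complicated steps such as text alignment,...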

40
Emerging
1305 seanghay/speechviewer

A quick audio dataset viewer

40
Emerging
1306 1ytic/pytorch-edit-distance

Levenshtein edit-distance on PyTorch and CUDA

40
Emerging
1307 soupslurpr/Transcribro

Private and on-device speech recognition keyboard and service for Android.

40
Emerging
1308 AndroidMaryTTS/AndroidMaryTTS

Android MARY TTS - an open-source, offline HMM-Based text-to-speech...

40
Emerging
1309 RafalWilinski/serverless-medium-text-to-speech

🔊 Serverless-based, text-to-speech service for Medium articles

40
Emerging
1310 DmitryRyumin/INTERSPEECH-2023-24-Papers

INTERSPEECH 2023-2024 Papers: A complete collection of influential and...

40
Emerging
1311 mmpneo/curses

Speech to Text and KB input captions for OBS, VRChat, Twitch chat and Discord

40
Emerging
1312 saidsef/tika-document-to-text

Apache Tika extract text and metadata from any document format with this...

40
Emerging
1313 Detoxfox4234/Qwen3-Voice-Factory

Local, portable GUI for Qwen3-TTS. Optimized for NVIDIA RTX 50 Series (CUDA...

40
Emerging
1314 gheyret/UQSpeechDataset

Uyghur Single Speaker Speech Dataset. Uyghur-language speech dataset.

40
Emerging
1315 TeaPoly/Conformer-Athena

Dynamic Chunk Streaming and Offline Conformer based on athena-team/Athena.

40
Emerging
1316 jscrane/TTS

Arduino Text-to-Speech Library

40
Emerging
1317 p1an-lin-jung/teochew-g2p

A text-side processing tool and orthography standard for Teochew, mainly serving speech synthesis for the Teochew dialect

40
Emerging
1318 foamliu/Listen-Attend-Spell-v2

PyTorch implementation of Listen Attend and Spell Automatic Speech Recognition (ASR).

40
Emerging
1319 pymike00/YouTube-Tutorials

📂 Source Code for (some of) the Programming Tutorials from...

40
Emerging
1320 danielclough/vibevoice-rs

Rust implementation of VibeVoice text-to-speech with voice cloning and...

40
Emerging
1321 lancejames221b/jarvis-voice

OpenJarvis — Real-time AI voice assistant for Discord. Talk to the same...

40
Emerging
1322 Kajitsy/Emilia

Emilia - Desktop Character.AI Client

40
Emerging
1323 RoySheffer/im2wav

Official implementation of the pipeline presented in I hear your true...

40
Emerging
1324 izwi-ai/izwi

On-device AI engine for transcription, TTS, and voice workflows.

40
Emerging
1325 zw76859420/ASR_Syllable

Research on acoustic models for speech recognition based on convolutional neural networks

40
Emerging
1326 Baidu-AIP/speech-tts-cors

Baidu Speech: cross-origin speech synthesis demo and supporting library

40
Emerging
1327 hyeonsangjeon/computing-Korean-STT-error-rates

Python function package for computing the character error rate (CER) and word error rate (WER) of output scripts from Korean STT sentence recognizers

40
Emerging
1328 speechio/BigCiDian

Pronunciation lexicon covering both English and Chinese languages for...

40
Emerging
1329 speechly/speechly

Client libraries, examples and demos of Speechly API for the Web.

40
Emerging
1330 SadeghKrmi/pertts-streamlit

Persian text-to-speech streamlit interface

40
Emerging
1331 adi-gov-tw/Taiwan-Tongues-ASR-CE

Taiwan Tongues ASR CE is an open-source speech recognition (Automatic Speech Recognition,...

40
Emerging
1332 Warma10032/easytts

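Building the simplest collection of TTS front ends and the simplest audiobook production workflow. Splits novels into sentences with regex rules and uses RoBERTa to identify the speaker of each line of dialogue, enabling one-click generation of multi-speaker...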

40
Emerging
1333 dusty-nv/jetson-voice

ASR/NLP/TTS deep learning inference library for NVIDIA Jetson using PyTorch...

40
Emerging
1334 Pictalk-speech-made-easy/pictalk-frontend

Pictalk is an open-source application designed to assist individuals with...

40
Emerging
1335 reybahl/Assistant

A machine learning powered, voice-based virtual assistant for Raspberry Pi....

40
Emerging
1336 hcy71o/SNAC

Unofficial Pytorch implementation of SNAC: Speaker-normalized affine...

40
Emerging
1337 adelacvg/ttts

Train the next generation of TTS systems.

40
Emerging
1338 agent87/RW-DEEPSPEECH-API

An end-to-end deep speech REST API containing speech-to-text and text-to-speech...

40
Emerging
1339 double22a/asr_nlp_paper_code

Papers of ASR, Tools of ASR

40
Emerging
1340 abus-aikorea/kara-audio

Gradio WebUI for whisper, faster-whisper, whisper-timestamped. Supports...

40
Emerging
1341 unilight/jatts

JATTS: A modern, research-oriented Japanese Text-to-speech Open-sourced Toolkit

40
Emerging
1342 ShadowForests/VoiceToSpeech

Live speech recognition to synthesized speech with hundreds of voices, TTS,...

40
Emerging
1343 stts-se/wikispeech-server

The main API for Wikispeech

40
Emerging
1344 HurroWorld/text-to-audio2face

Web interface to convert text to speech and route it to an Audio2Face...

40
Emerging
1345 gillesdemey/google-speech-v2

💬 Reverse Engineering Google's Speech To Text API (v2)

40
Emerging
1346 maum-ai/nuwave2

NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling...

40
Emerging
1347 itspyguru/Tkinter-Applications

A collection of small tkinter apps made by me

40
Emerging
1348 elimu-ai/vitabu

📚 Android application for reading storybooks and expanding word vocabulary.

40
Emerging
1349 nerdaxic/glados-voice-assistant

DIY Voice Assistant based on the GLaDOS character from Portal video game...

40
Emerging
1350 n0th1ng-else/voice-to-text-bot

Telegram bot that converts Voice messages into text

40
Emerging
1351 mapluisch/OpenAI-Text-To-Speech-for-Unity

Implementation of OpenAI's Text-To-Speech in Unity. Synthesize any text and...

40
Emerging
1352 tarun7r/SpeechAlgo

A Comprehensive Speech Processing Algorithms Library for research and production use

40
Emerging
1353 belambert/asr-tools

Libraries and scripts for manipulating and handling ASR output/n-bests/etc.

40
Emerging
1354 racai-ai/RobinASR

Romanian Automatic Speech Recognition from the ROBIN project

40
Emerging
1355 rishikksh20/Avocodo-pytorch

Avocodo: Generative Adversarial Network for Artifact-free Vocoder

40
Emerging
1356 myuan19/voiceInput

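Windows AI voice input 🎙: press a hotkey and speak to type, with optional text polishing. Break free of typing limits for unconstrained, efficient expression.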

40
Emerging
1357 codyw912/open-asr-server

OpenAI-compatible ASR server with pluggable local backends (Parakeet,...

40
Emerging
1358 daanzu/speech-training-recorder

Simple GUI application to help record audio dictated from given text...

40
Emerging
1359 deepgram-starters/django-voice-agent

Get started using Deepgram's Voice Agent with this Django demo app

40
Emerging
1360 FR33TR1ST/VoiceAssistant

A voice assistant with WhisperAI speech recognition

40
Emerging
1361 Labmem-Zhouyx/CDFSE_FastSpeech2

The Official Implementation of “Content-Dependent Fine-Grained Speaker...

40
Emerging
1362 sophiefy/StellaVoiceChanger

Deep-learning-based voice changer, supporting local inference.

40
Emerging
1363 mrtozner/vox

Local voice AI framework for Rust. Whisper + LLM + TTS with no cloud dependencies.

40
Emerging
1364 JollyToday/GhostCut-auto_video_translation

Auto video translation: the video translator can automatically translate video hard...

39
Emerging
1365 cvqluu/TDNN

Time delay neural network (TDNN) implementation in Pytorch using unfold method

39
Emerging
1366 TranscribeJs/transcribe.js

Monorepo for Transcribe.js

39
Emerging
1367 mrf345/django_gtts

Django app extension to add gTTS google text-to-speech

39
Emerging
1368 westonruter/spoken-word

Spoken Word

39
Emerging
1369 felipefacundes/brasiltts

Brasil TTS is a set of speech synthesizers in Brazilian Portuguese,...

39
Emerging
1370 Yazdi9/TTS-MultiLingual

Text-to-speech with multilingual support (20+ languages)

39
Emerging
1371 john-carroll-sw/coffee-chat-voice-assistant

Coffee Chat Voice Assistant is a voice-driven ordering system powered by...

39
Emerging
1372 WismutHansen/READ2ME

Turn text from websites into spoken audio with edge-tts, F5, etc. and save...

39
Emerging
1373 jcrodriguez1989/heyshiny

Package: New `shiny` input that translates audio to text

39
Emerging
1374 xifan2333/fcitx5-vinput

Local offline voice input plugin for Fcitx5

39
Emerging
1375 34j/neural-source-filter

Python package for NSF and NSF-HiFi-GAN (unofficial)

39
Emerging
1376 royshil/cloudvocal

Cloud AI live transcription and translation service plugin

39
Emerging
1377 syhw/wer_are_we

Attempt at tracking the state of the art and recent results (bibliography) on...

39
Emerging
1378 thewh1teagle/phonikud-tts

phonikud-tts - text to speech in Hebrew

39
Emerging
1379 deepgram-starters/node-text-to-speech

Get started using Deepgram's Text-to-Speech with this Node demo app

39
Emerging
1380 AIFSH/ComfyUI-GPT_SoVITS

A ComfyUI custom node for GPT-SoVITS. You can now do voice cloning and TTS in ComfyUI.

39
Emerging
1381 zhangzijie-pro/Speaker-Verification

Dual-model speech AI toolkit for speaker verification and speaker-aware...

39
Emerging
1382 google-research-datasets/TextNormalizationCoveringGrammars

Covering grammars for English and Russian text normalization

39
Emerging
1383 voicegain/python-sdk

Python SDK for working with Voicegain Speech-to-Text

39
Emerging
1384 Speaker-Identification/You-Only-Speak-Once

Deep Learning - one shot learning for speaker recognition using Filter Banks

39
Emerging
1385 stefantaubert/en-tts

Command-line interface and Python library for synthesizing English texts into speech.

39
Emerging
1386 AudioLLMs/AudioBench

AudioBench: A Universal Benchmark for Audio Large Language Models

39
Emerging
1387 saurabhshri/CCAligner

🔮 Word by word audio subtitle synchronisation tool and API. Developed under...

39
Emerging
1388 Executedone/Chinese-FastSpeech2

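Continues training on the Biaobei data and improves the original FastSpeech2 model by adding prosody representation and prosody prediction modules, making Chinese pronunciation more vivid and rhythmic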

39
Emerging
1389 TrevorS/qwen3-tts-rs

Rust implementation of Qwen3-TTS speech synthesis

39
Emerging
1390 megaease/easevoice-trainer

EaseVoice Trainer is a simple and user-friendly voice cloning and speech...

39
Emerging
1391 kaieberl/paper2speech

Convert any English paper or scientific book to audio

39
Emerging
1392 SungFeng-Huang/Meta-TTS

Official repository of https://doi.org/10.1109/TASLP.2022.3167258. More...

39
Emerging
1393 tugstugi/pytorch-speech-commands

Speech commands recognition with PyTorch | Kaggle 10th place solution in...

39
Emerging
1394 pkozul/ha-tts-bluetooth-speaker

TTS Bluetooth Speaker for Home Assistant

39
Emerging
1395 Jaffe2718/Microphone-Text-Input

A fabric mod that can recognize speech as text messages and automatically...

39
Emerging
1396 JasonLovesDoggo/Flow

Native macOS dictation that captures audio, transcribes speech, and formats...

39
Emerging
1397 ORI-Muchim/Efficient-Speech

Lightweight Korean TTS Model based on FastSpeech2

39
Emerging
1398 dokterbob/macos-speech-server

Local, fast and efficient Speech to Text (STT) and Text to Speech (TTS) on...

39
Emerging
1399 awexandrr/audioWhisper

Listen to any audio stream on your machine and print out the transcribed or...

39
Emerging
1400 asticode/go-astibob

Golang framework to build an AI that can understand and speak back to you,...

39
Emerging