All Voice AI Tools

6,981 tools ranked by quality score · Page 15 of 70

Showing 1401–1500 of 6,981
# Tool Score Tier
1401 xenova/kokoro-web

ML-powered speech synthesis directly in your browser

39
Emerging
1402 IhorShevchuk/RHVoice-spm

A free and open source speech synthesizer with support for a lot of languages...

39
Emerging
1403 zakuro-ai/asr

ASRDeepspeech x Sakura-ML (English/Japanese) with deepspeech2 model in...

39
Emerging
1404 kgnlp/allophant

A multilingual phoneme recognizer capable of generalizing zero-shot to...

39
Emerging
1405 bedriyan/speaky

Voice-to-text for macOS, powered by on-device AI. Press a hotkey, speak, and...

39
Emerging
1406 roboticslab-uc3m/speech

Text To Speech (TTS) and Automatic Speech Recognition (ASR).

39
Emerging
1407 mrtrizer/UnityPiper

Offline text to speech inside Unity

39
Emerging
1408 aqiu202/aqiu-spring-boot-starter-projects

A personal collection of ready-to-use Spring Boot Starter components, simple and practical; it will keep being extended as needs arise!

39
Emerging
1409 pnlpal/pnl-reader

PNL Reader: read quietly or read aloud

39
Emerging
1410 chrisvdev/obs-chat

Also known as CVTalk, a Twitch chat viewer made with React for use in OBS...

39
Emerging
1411 liuhaozhe6788/voice-cloning-collab

an improved version of Real-time-voice-cloning

39
Emerging
1412 hiteshsahu/Android-TTS-STT

One-line solution for Android text-to-speech (TTS) & speech-to-text (STT)...

39
Emerging
1413 supikiti/PNCC

An implementation of Power Normalized Cepstral Coefficients: PNCC

39
Emerging
1414 OwenTyme/voice-zero

Collection of samples suitable for use with zero-shot text to speech engines.

39
Emerging
1415 arghyasur1991/LiveTalk-Unity

LiveTalk is a unified, high-performance talking head generation system that...

39
Emerging
1416 DrewThomasson/ebook2audiobookSTYLETTS2

This simple program makes use of Calibre to convert an ebook into chapters...

39
Emerging
1417 mayeaux/generate-subtitles

Generate transcripts for audio and video content with a user friendly UI,...

39
Emerging
1418 nihui/ncnn-android-piper

ncnn Android Piper: the fast and local neural text-to-speech engine

39
Emerging
1419 sooftware/speech-transformer

Transformer implementation specialized in speech recognition tasks using PyTorch.

39
Emerging
1420 cool-japan/voirs

VoiRS is a cutting-edge Text-to-Speech (TTS), Voice Recognition, Sound...

39
Emerging
1421 hubendubler/gTTS.js

A Promise based Node.js/TypeScript port of the gTTS Google-Text-To-Speech...

39
Emerging
1422 gmltmd789/UnitSpeech

An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis...

39
Emerging
1423 ShawnHymel/tflite-speech-recognition

Demo for training a convolutional neural network to classify words and...

39
Emerging
1424 sldimitrov/english_learning_system

English Learning System I have developed in order to help others in...

39
Emerging
1425 zthxxx/python-Speech_Recognition

A simple example of using the Baidu speech recognition API with Python.

39
Emerging
1426 rishikksh20/SoundStorm-pytorch

Google's SoundStorm: Efficient Parallel Audio Generation

39
Emerging
1427 ReneTode/My-AppDaemon

My apps, my helpfiles, all about AppDaemon for Home Assistant

39
Emerging
1428 darkautism/sensevoice-rs

A Rust-based, SenseVoiceSmall

39
Emerging
1429 spokestack/spokestack-python

Spokestack is a library that allows a user to easily incorporate a voice...

39
Emerging
1430 tabahi/contexless-phonemes-CUPE

PyTorch model for contextless phoneme prediction from speech audio

39
Emerging
1431 alessandroragano/scoreq

SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)

39
Emerging
1432 cameronking4/openai-realtime-blocks

Voice AI components using OpenAI Realtime API to copy and paste into your...

39
Emerging
1433 OlivierMary/MySuperWhisper

A global voice dictation tool for Linux using local OpenAI Whisper. Fast,...

39
Emerging
1434 flogy/gatsby-mdx-tts

🗣 Adds speech output to your Gatsby site using Amazon Polly.

39
Emerging
1435 Aivis-Project/aivmlib-web

Aivis Voice Model File (.aivm/.aivmx) Utility Library for Web

39
Emerging
1436 nodef/extra-googletts

Generate speech audio from super long text through machine (via "Google...

39
Emerging
1437 black-roland/homeassistant-yandex-speechkit

Yandex SpeechKit integration for Home Assistant providing speech-to-text and...

39
Emerging
1438 espnet/interspeech2019-tutorial

INTERSPEECH 2019 Tutorial Materials

39
Emerging
1439 balisujohn/tortoise.cpp

A ggml (C++) re-implementation of tortoise-tts

39
Emerging
1440 hans00/phonemize

Pure JS fast phonemizer with rule-based G2P prediction

39
Emerging
1441 smartherd/SpeechToText

Speech To Text in Android

39
Emerging
1442 playht/text-to-speech-api

Play.ht's Text to Speech API

39
Emerging
1443 npuichigo/voicenet

Speech synthesis platform based on TensorFlow and Sonnet

39
Emerging
1444 6drf21e/ChatTTS_colab

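🚀 One-click deployment (offline bundle included)! Based on ChatTTS, with support for streaming output, random voice-timbre draws, long-audio generation, and multi-role reading. Simple to use, with no complicated installation.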

39
Emerging
1445 sovse/Rus-SpeechRecognition-LSTM-CTC-VoxForge

Russian speech recognition using TensorFlow, trained on the VoxForge corpus

39
Emerging
1446 SergeyShk/Speech-to-Text-Russian

A project for Russian-language speech recognition based on pykaldi.

39
Emerging
1447 keonlee9420/DailyTalk

Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational...

39
Emerging
1448 MiniMax-AI/MiniMax-AI.github.io

The official GitHub Page for MiniMax

39
Emerging
1449 jorge-menjivar/super-stt

Super STT enables effortless voice-to-text in any application, using the...

39
Emerging
1450 CodeBySonu95/VoxSherpa-TTS

🎙️ VoxSherpa TTS Offline Neural Text-to-Speech Engine for Android ⚡...

39
Emerging
1451 sevangelatos/py-ttspico

Python svox picotts wrapper

39
Emerging
1452 ioBroker/ioBroker.sonus

Control ioBroker with voice

39
Emerging
1453 HarunoriKawano/Wav2vec2.0

Implementation of the paper "wav2vec 2.0: A Framework for Self-Supervised...

39
Emerging
1454 nature-heart-software/izabela

Your speech assistant. Communicate with text-to-speech in games, on voice...

39
Emerging
1455 LuckyHookin/edge-TTS-record

A Windows tool that records the Microsoft Edge browser's text-to-speech (TTS) output and saves it as .wav audio.

39
Emerging
1456 tomchang25/whisper-auto-transcribe

Auto transcribe tool based on whisper

39
Emerging
1457 orange2ai/youtube-subtitle-translator

🌐 Real-time YouTube subtitle translator browser extension. Translate...

39
Emerging
1458 TheNewC0der-24/Textonus

Voice to Text Online Notepad Professional, Accurate & Free Speech...

39
Emerging
1459 gooofy/py-picotts

Python wrappers around SVOX Pico TTS

39
Emerging
1460 1ytic/open_stt_e2e

PyTorch end-to-end speech recognition

39
Emerging
1461 LonePheasantWarrior/TalkifyTTS

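An Android speech synthesis app (TTS engine) driven by cloud-based large models. Supports models from Doubao, Tencent, Microsoft, Qwen, and others. An Android text-to-speech...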

39
Emerging
1462 harvard-edge/multilingual_kws

Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus

39
Emerging
1463 MuGuiLin/VoiceDictation

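iFlytek voice dictation WebAPI - converts speech (≤60 seconds) into the corresponding text so that machines can "understand" human language, like fitting the machine with "ears" so that it can "hear".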

39
Emerging
1464 JusperLee/Conv-TasNet

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech...

39
Emerging
1465 KernelInterrupt/whisper4dart

whisper4dart is a dart wrapper for whisper.cpp, designed to offer an...

39
Emerging
1466 bold-ronin/lira

A Voice-First AI Companion

39
Emerging
1467 jhubbardsf/svelte-speech-recognition

Speech recognition library for Svelte

39
Emerging
1468 dictate-button/dictate-button

Customizable Web Component that adds speech-to-text dictation capabilities...

39
Emerging
1469 spokestack/spokestack-android

Extensible Android mobile voice framework: wakeword, ASR, NLU, and TTS....

39
Emerging
1470 VITA-Group/Audio-Lottery

[ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight,...

39
Emerging
1471 aiola-lab/drax

Drax: Speech Recognition with Discrete Flow Matching

38
Emerging
1472 GuillaumeFalourd/formulas-python

Ritchie CLI formulas in Python 🐍

38
Emerging
1473 1038lab/ComfyUI-MegaTTS

A ComfyUI custom node based on ByteDance MegaTTS3, enabling high-quality...

38
Emerging
1474 bytectlgo/edge-tts

Edge TTS is a command-line tool based on Microsoft Edge's text-to-speech...

38
Emerging
1475 evilC/HotVoice

Adds Speech Recognition support to AutoHotkey, via a C# DLL

38
Emerging
1476 alan-ai/alan-sdk-reactnative

The Self-Coding System for Your App — Alan AI SDK for React Native

38
Emerging
1477 jamditis/audiobash

Voice-controlled terminal for developers. Speak commands, execute instantly.

38
Emerging
1478 khanld/ASR-Wav2vec-Finetune

:zap: Fine-tune Wav2vec 2.0 for speech recognition

38
Emerging
1479 speechsuper/SpeechSuper-API-Samples

Deep learning based speech and pronunciation assessment API for 8 languages.

38
Emerging
1480 jordicor/santa-claus-is-calling

A magical Christmas experience where Santa Claus (AI with Santa's voice)...

38
Emerging
1481 QuantiusBenignus/blurt

Gnome shell extension for accurate OFFLINE speech to text input in Linux...

38
Emerging
1482 xingchensong/Speech-Transformer-tf2.0

Transformer for ASR systems (via TensorFlow 2.0)

38
Emerging
1483 MarkParker5/STARK

S.T.A.R.K. - Speech And Text Algorithmic Recognition Kit

38
Emerging
1484 andi611/ZeroSpeech-TTS-without-T

A PyTorch implementation for the ZeroSpeech 2019 challenge.

38
Emerging
1485 openconcerto/MisterWhisper

Push to talk voice recognition using Whisper

38
Emerging
1486 piotrkawa/deepfake-whisper-features

Implementation of the paper "Improved DeepFake Detection Using Whisper Features"

38
Emerging
1487 askrella/speech-rest-api

Transcription and TTS Rest API (OpenAI Whisper, Speechbrain)

38
Emerging
1488 AA-Factory/aafactory-prototype

⚡ AI Avatar Factory is an interface for creating and managing AI avatars. ⚡

38
Emerging
1489 iceychris/LibreASR

:speech_balloon: An On-Premises, Streaming Speech Recognition System

38
Emerging
1490 kaushiknishchay/ComfyUI-Qwen3-ASR

ComfyUI nodes for Qwen3-ASR (0.6B/1.7B) and ForcedAligner. Supports...

38
Emerging
1491 rohit-lakhanpal/ai-hackathon-starter-kit

This project has been created to make AI accessible and easy for everyone....

38
Emerging
1492 twangodev/speak-mintlify

Automatically generate voice narration for your Mintlify documentation.

38
Emerging
1493 SpeechColab/Leaderboard

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform...

38
Emerging
1494 gorkemkaramolla/whisper-run

Faster Whisper with Speaker Diarization

38
Emerging
1495 Saganaki22/ComfyUI-KittenTTS

😻 A simple ComfyUI custom node for KittenTTS - an ultra-lightweight...

38
Emerging
1496 botbahlul/crx-live-translate

Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video...

38
Emerging
1497 ElmTran/praises

Praises is a text-to-speech tool that can help you read text easily.

38
Emerging
1498 AkishinoShiame/Chinese-Speech-Emotion-Datasets

Datasets of A Deep Convolutional Neural Network Based Virtual Elderly...

38
Emerging
1499 hcy71o/SC-CNN

SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker...

38
Emerging
1500 jingangdidi/voice_clone

An OpenVoice-based voice cloning tool, single executable file (~14M),...

38
Emerging