All Voice AI Tools

6,981 tools ranked by quality score · Page 10 of 70

Showing 901–1000 of 6,981
# Tool Score Tier
901 m3hrdadfi/soxan

Wav2Vec for speech recognition, classification, and audio classification

44
Emerging
902 zslrmhb/Omniverse-Virtual-Assisstant

Audio2Face Avatar with Riva SDK functionality

44
Emerging
903 symblai/getting-started-samples

Code samples to get started quickly with Symbl's Voice SDK and APIs:...

44
Emerging
904 ycyy/faster-whisper-webui

A Gradio web UI for faster-whisper

44
Emerging
905 thewh1teagle/piper-rs

Use piper TTS models in Rust

44
Emerging
906 Berkeley-Speech-Group/sylber

Sylber: Syllabic Embedding Representation of Speech from Raw Audio

44
Emerging
907 earlephilhower/BackgroundAudio

Arduino library for easy, interrupt driven speech, MP3, AAC, and WAV...

44
Emerging
908 zw76859420/ASR_WORD

Builds the acoustic model with an end-to-end approach, using characters as the modeling unit and a DCNN-CTC network architecture.

44
Emerging
909 yrom/finetune-index-tts

IndexTTS Fine-tuning notebooks

44
Emerging
910 HeyHeyChicken/NOVA-NodeJS

NOVA is a customizable voice assistant made with Node.js.

44
Emerging
911 OpenMOSS/MOSS-Speech

MOSS-Speech is a true speech-to-speech large language model without text guidance.

44
Emerging
912 huschen/kaggle_speech_recognition

Conv-LSTM-CTC speech recognition network (end-to-end), written in TensorFlow.

44
Emerging
913 coqui-ai/TTS-papers

🐸 collection of TTS papers

44
Emerging
914 tonesto7/echo-speaks

Integrate your Amazon Echo devices into your Hubitat environment to create...

44
Emerging
915 nixonyh/UnityTTS

Text to Speech in Unity.

44
Emerging
916 yerfor/SyntaSpeech

SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022;...

44
Emerging
917 syntithenai/hermod

voice services stack from audio hardware through hotword, ASR, NLU, AI...

44
Emerging
918 zycv/awesome-keyword-spotting

This repository is a curated list of awesome Speech Keyword Spotting...

44
Emerging
919 just-ai/aimybox-android-sdk

Voice assistant SDK for Android

44
Emerging
920 FelixWaweru/elevenlabs-node

Eleven Labs text to speech package for NodeJS. You can use the official...

44
Emerging
921 awsaf49/audio_classification_models

TensorFlow Audio Classification Models

44
Emerging
922 Justmalhar/open-audio

Open-Audio TTS: A robust web app leveraging OpenAI's powerful Text-to-Speech...

44
Emerging
923 rishikksh20/TFGAN

TFGAN: Time and Frequency Domain Based Generative Adversarial Network for...

44
Emerging
924 coqui-ai/TTS-recipes

🐸TTS recipes for different datasets

44
Emerging
925 aofdev/vue-pwa-speech

Vue 2 synchronous speech recognition (speech to text) with Google Cloud...

44
Emerging
926 Rubiksman78/MonikA.I

Submod for MAS with AI based features

44
Emerging
927 keonlee9420/Comprehensive-Transformer-TTS

A Non-Autoregressive Transformer based Text-to-Speech, supporting a family...

44
Emerging
928 persephone-tools/persephone

A tool for automatic phoneme transcription

44
Emerging
929 momysnow/Momy-Desk-Robot

Smart desktop robot.

44
Emerging
930 am-sokolov/videodubber

A program for automatically dubbing any video file into many languages.

44
Emerging
931 oddlama/whisper-overlay

A Wayland overlay providing speech-to-text functionality for any application...

44
Emerging
932 aofdev/vue-speech-streaming

Vue 2 streaming speech recognition (speech to text) with Google Cloud Speech

44
Emerging
933 ttaoREtw/Tacotron-pytorch

A PyTorch implementation of Tacotron: End-to-end Text-to-speech Deep-Learning Model

44
Emerging
934 shenasa-ai/speech2text

A Deep-Learning-Based Persian Speech Recognition System

44
Emerging
935 apaar97/translate

Android app to translate text conversations, supporting 90+ languages with...

44
Emerging
936 Aratako/MioTTS-Inference

Inference server for MioTTS, a lightweight and fast LLM-based TTS model.

44
Emerging
937 nl8590687/ASRT_SDK_Java

ASRT Speech Recognition SDK for Java.

44
Emerging
938 murf-ai/murf-python-sdk

Python SDK for the Murf text-to-speech API

44
Emerging
939 jpescada/TwitterPiBot

A Python-based bot for Raspberry Pi that grabs tweets with a specific...

44
Emerging
940 snakers4/open_stt

Open STT

44
Emerging
941 LSimon95/megatts2

Unofficial implementation of Megatts2

44
Emerging
942 jim-schwoebel/download_audioset

📁 This repo makes it easy to download the raw audio files from AudioSet...

44
Emerging
943 jaco-bro/diajax

Dia-JAX: A JAX port of Dia, the text-to-speech model for generating...

44
Emerging
944 jishengpeng/WavTokenizer

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second...

44
Emerging
945 dunky11/voicesmith

[WIP] VoiceSmith makes training text to speech models easy.

44
Emerging
946 subho406/TF-Speech-Recognition-Challenge-Solution

Source code of the model used in TensorFlow Speech Recognition Challenge...

44
Emerging
947 drmfinlay/pyjsgf

JSpeech Grammar Format (JSGF) compiler, matcher and parser package for Python.

44
Emerging
948 sljeff/anycast

An AI-Powered Podcast App.

44
Emerging
949 hirofumi0810/asr_preprocessing

Python implementation of pre-processing for End-to-End speech recognition

44
Emerging
950 cuinjune/text2video

A software tool that converts text to video for a more engaging learning experience

44
Emerging
951 mozilla-ai/speech-to-text-finetune

Blueprint by Mozilla.ai for finetuning a Speech-To-Text model in your own language

44
Emerging
952 PABannier/bark.cpp

Suno AI's Bark model in C/C++ for fast text-to-speech generation

44
Emerging
953 gurjar1/OmniDictate

Free, open-source, real-time dictation for Windows. Runs locally (no...

44
Emerging
954 Pikurrot/whisper-gui

A simple GUI to use Whisper.

44
Emerging
955 igorshmukler/kokoro-ruslan

Kokoro Language Model Training Script for Russian (Ruslan Corpus)

44
Emerging
956 rajkishorbgp/JARVIS-AI-Assistant

JARVIS AI Assistant 🤖 A virtual assistant project inspired by Tony Stark's...

44
Emerging
957 SEPIA-Framework/sepia-stt-server

SEPIA server to support open-source speech recognition via WebSocket connection.

44
Emerging
958 matteo-convertino/vosk-build-model

How to create your own model for vosk

44
Emerging
959 finchvox/finchvox

Voice AI Observability, Elevated

44
Emerging
960 travisvn/edge-tts-extension

Chrome extension to generate free, high-quality text-to-speech using...

44
Emerging
961 lucasnewman/f5-tts-swift

Implementation of F5-TTS in Swift using MLX

44
Emerging
962 SynHub/syn-speech

Syn.Speech is a flexible speaker independent continuous speech recognition...

44
Emerging
963 rse/speechflow

Speech Processing Flow Graph

44
Emerging
964 Amirrezahmi/Zozo-Assistant

Zozo Assistant is a voice-activated chatbot that performs tasks based on...

44
Emerging
965 Azure-Samples/sonic-brief

Sonic Brief Project is an Azure-based system that transcribes and...

44
Emerging
966 neosun100/cosyvoice-docker

🎙️ CosyVoice All-in-One Docker - Production-ready TTS with Web UI, REST API...

44
Emerging
967 djmango/obsidian-transcription

Obsidian plugin to create high-quality transcriptions from markdown linked...

44
Emerging
968 isaiahbjork/expo-kokoro-onnx

Run Kokoro TTS locally on device using Expo & ONNX Runtime

44
Emerging
969 upskyy/Transformer-Transducer

PyTorch implementation of "Transformer Transducer: A Streamable Speech...

43
Emerging
970 Markfryazino/wav2lip-hq

Extension of Wav2Lip repository for processing high-quality videos.

43
Emerging
971 bawangxx/XZVoice

Free and open source text-to-speech software

43
Emerging
972 TimoBolkart/voca

This codebase demonstrates how to synthesize realistic 3D character...

43
Emerging
973 Cvandia/nonebot-plugin-fishspeech-tts

TTS plugin for nonebot2 supporting fish-speech and fish-audio

43
Emerging
974 isomoes/blivedm_rs

A powerful Rust WebSocket client library for Bilibili live-room danmaku (bullet comments), supporting real-time danmaku monitoring, text-to-speech (TTS), and browser cookie...

43
Emerging
975 speechmatics/speechmatics-python-sdk

Python SDKs for Speechmatics APIs

43
Emerging
976 aman179102/podvoice

Local-first CLI that turns Markdown scripts into multi-speaker podcast-style...

43
Emerging
977 sovaai/sova-asr

SOVA ASR (Automatic Speech Recognition)

43
Emerging
978 antor44/livestream_video

playlist4whisper manages media streams playlists for livestream_video.sh,...

43
Emerging
979 ImNimboss/uberduck

A synchronous and asynchronous API wrapper for the UberDuck text-to-speech...

43
Emerging
980 lixiangyu890601/EasyAICC-Easy-AI-Call-Center

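Outbound calling system, intelligent outbound calling, automated outbound calling system, manual outbound calling, call center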

43
Emerging
981 lukaszliniewicz/Pandrator

Turn PDFs and EPUBs into audiobooks, subtitles or videos into dubbed videos...

43
Emerging
982 yl4579/StyleTTS-VC

Official Implementation of StyleTTS-VC

43
Emerging
983 opensource-spraakherkenning-nl/Kaldi_NL

Code related to the Dutch instance and user groups of the KALDI speech...

43
Emerging
984 Pankaj-Baranwal/pocketsphinx

Updated ROS bindings to pocketsphinx

43
Emerging
985 harmlessman/PAFTS

PAFTS: a library that preprocesses audio for TTS.

43
Emerging
986 lablab-ai/OpenAI_Whisper_Streamlit

A minimalistic automatic speech recognition Streamlit-based web app powered...

43
Emerging
987 wit-ai/android-voice-demo

Example of how to build a voice-enabled Android app with Wit.ai

43
Emerging
988 meemalabs/laravel-text-to-speech

💬 A wrapper for popular TTS services to create a simpler & more uniform API....

43
Emerging
989 dmisol/flexatar-virtual-webcam

Personalized Virtual Webcam for WebRTC

43
Emerging
990 felixchenfy/Speech-Commands-Classification-by-LSTM-PyTorch

Classification of 11 types of audio clips using MFCCs features and LSTM....

43
Emerging
991 aws-solutions/content-localization-on-aws

Automatically generate multi-language subtitles using AWS AI/ML services....

43
Emerging
992 yuhr/langue

A modern platform for conlanging. Currently in the planning stage.

43
Emerging
993 drien/tts-joinery

Stitch together text-to-speech over 4096 characters via the OpenAI API

43
Emerging
994 pandeydivesh15/AVSR-Deep-Speech

Google Summer of Code 2017 Project: Development of Speech Recognition Module...

43
Emerging
995 notAI-tech/IndicASR

Speech Recognition for Indic languages.

43
Emerging
996 jojojaeger/whisper-streamlit

This master's thesis project is based on OpenAI Whisper with the goal to...

43
Emerging
997 Hecate2/sukasuka-vocal-dataset-builder

SukaSuka anime vocal dataset. 1st anime vocal dataset. Extract audio (vocal) files from...

43
Emerging
998 MohammedRashad/FPGA-Speech-Recognition

Experimental Speech Recognition System using VHDL & MATLAB.

43
Emerging
999 haguro/elevenlabs-go

A Go API client library for the ElevenLabs speech synthesis platform

43
Emerging
1000 youmebangbang/TTS-dataset-tools

Automatically generates TTS dataset using audio and associated text. Make...

43
Emerging