All Voice AI Tools

6,981 tools ranked by quality score · Page 24 of 70

Showing 2301–2400 of 6,981
# Tool Score Tier
2301 BayramAnnakov/gmail-to-podcast

Transform Gmail newsletters into AI-generated podcast conversations using...

30
Emerging
2302 adelacvg/detail_tts

All generative models in one for a better TTS model

30
Emerging
2303 jonelo/jAdapterForNativeTTS

A simple pure Java library that allows you to use the native Text To Speech...

30
Emerging
2304 daslearning-org/text-to-speech-offline

A lightweight cross-platform Text-To-Speech application which works on...

30
Emerging
2305 FOLLGAD/reddit-video-maker

AI video content creation before it was cool

30
Emerging
2306 black-roland/homeassistant-salutespeech

SaluteSpeech integration for Home Assistant providing speech-to-text and...

30
Emerging
2307 tts-hub/monotonic_alignment_search

Monotonically align text and speech

30
Emerging
2308 botbahlul/js-live-audio-video-translate

HTML Web template that can RECOGNIZE any live audio/video streaming (using...

30
Emerging
2309 Otosaku/OtosakuTTS-iOS

Swift library for offline text-to-speech synthesis on iOS/macOS. Generate...

30
Emerging
2310 clarinsi/Slovene_ASR_e2e

Automatic Speech Recognition tool

30
Emerging
2311 loushou/flutter_tts_improved

A fork of the Flutter_TTS (https://github.com/dlutton/flutter_tts) plugin,...

30
Emerging
2312 Julia-Roman/pepega-tts

Discord bot for Google and Polly Text-to-Speech

30
Emerging
2313 jfainberg/lattice_combination

Lattice combination algorithm to combine inaccurate transcripts with...

30
Emerging
2314 linto-ai/linto-diarization

Speaker diarization service

30
Emerging
2315 chameleon-ai/vevo

Simple GUI for Amphion Vevo

30
Emerging
2316 rshahamiri/SpeechVision

Speech Vision (SV) is a Dysarthric Speech Recognition System that adopts a...

30
Emerging
2317 daanzu/kaldi_ag_training

Docker image and scripts for training finetuned or completely personal Kaldi...

30
Emerging
2318 SkyDocs/speaker-identification

Speaker Identification using Neural Net.

30
Emerging
2319 valeriorlandini/sonus

A Max/MSP package for sound experimentation and algorithmic composition

30
Emerging
2320 frrobledo/AutoDub

An advanced AI-powered tool that automatically translates and dubs YouTube...

30
Emerging
2321 sahu-adarsh/intervyu

Practice job interviews with Neerja, an AI interviewer powered by Claude....

30
Emerging
2322 daswer123/silero-tts-enhanced

Silero TTS Enhanced is a Python library that enhances the original Silero...

30
Emerging
2323 sshh12/Recording-Bot

A bot built to record and transcribe audio fragments from Discord.

30
Emerging
2324 aws-samples/amazon-transcribe-email-workflow

An Amazon Transcribe demo for "speech-to-text" conversion performed through...

30
Emerging
2325 SaptakBhoumik/easySpeech

easySpeech is an open-source Python wrapper for google speech to text API...

30
Emerging
2326 OPEXGroup/ITCC.YandexSpeechKitClient

Cross-platform client for Yandex SpeechKit Cloud API

30
Emerging
2327 thc1006/whisper-colab-tpu-transcriber

High-performance Google Colab Notebook for fast & accurate audio...

30
Emerging
2328 geekgirljoy/PHP

Examples of my PHP Code

30
Emerging
2329 Abhishek-op/SR

💡Kivy-android speech recognition

30
Emerging
2330 cmsflash/deep-learning-sota

State-of-the-art results for deep learning tasks in various fields.

30
Emerging
2331 zsl24/Tacotron2-Mandarin-HiFiGAN

Implementation of TTS with combination of Tacotron2 and HiFi-GAN

30
Emerging
2332 PRITHIVSAKTHIUR/Vision-to-VibeVoice-en

A Gradio-based demo for end-to-end vision-to-speech inference: Extract text...

30
Emerging
2333 robmsmt/SpeechLoop

Many ASRs under one roof. With Benchmarking... answering the question. What...

30
Emerging
2334 JSON2Video/json2video-nodejs-sdk

Create videos programmatically in the cloud from NodeJS: add watermarks,...

30
Emerging
2335 NICEElevateAI/ElevateAIDotNetSDK

.NET Core 6 SDK for ElevateAI

30
Emerging
2336 liou666/audiread

📻 A simple and user-friendly online TTS tool.

30
Emerging
2337 brailcom/speechd-el

Emacs speech and Braille output interface

30
Emerging
2338 ARAI-Telegram/teledash-backend-processing

Optional AI-powered features of Teledash, an open-source software for...

30
Emerging
2339 Audio-WestlakeU/UMA-ASR

This repository is the official implementation of unimodal aggregation (UMA)...

30
Emerging
2340 IndieCoderMM/smart-one-ai

🤖 AI assistant that can listen to user input and provide responses. It...

30
Emerging
2341 revsic/speechset

Numpy-librosa implementation of Speech dataset pipeline

30
Emerging
2342 abinashmeher999/voice-data-extract

A command line interface to combine text information from subtitles with...

30
Emerging
2343 jumon/pywer

A simple Python package to calculate word error rate (WER).

30
Emerging
2344 mikopbx/ModuleRHVoice

Text to speech voice generator by the RHVoice algorithm

30
Emerging
2345 tcsenpai/audiocoqui

A multilingual tool to convert PDF ebooks to audiobooks using XTTS v2 TTS...

30
Emerging
2346 eliangerard/simple-tts-mp3

Converts text to MP3 audio using google-tts-api, without a limit

30
Emerging
2347 ameerbadri/twilio-asr-realtime-dashboard

Twilio ASR and Intent Realtime Dashboard

30
Emerging
2348 nico-byte/whisper-web

The Whisper Web Transcription Server is a Python-based real-time...

30
Emerging
2349 rapidaai/rapida-python

Open-source Python SDK for real-time Voice AI, voice agents, streaming...

30
Emerging
2350 nemoramo/acoustic_model

This sub-repository is being built to create an acoustic model in Mandarin...

30
Emerging
2351 ndenicolais/SpeechAndText

Android application built with Kotlin and Jetpack Compose that shows how to...

30
Emerging
2352 twirapp/silero-tts-api-server

This is a simple server that uses Silero models to convert text to audio...

30
Emerging
2353 jorcelinojunior/whisper-vtt2srt

A robust WebVTT to SRT converter optimized for AI transcriptions (Whisper,...

30
Emerging
2354 LibraryOfCongress/speech-to-text-viewer

AWS Transcribe evaluation pipeline: bulk-process audio files and view the results

30
Emerging
2355 acyclics/speech-to-speech-translator

Enables a device to input speech from a microphone, translate speech to a...

30
Emerging
2356 build-with-groq/groq-voice-agent-template

A real-time voice AI agent built with Groq API that enables natural voice...

30
Emerging
2357 fano2458/Zhadiger-Kazakh-Language-AI

AI services project "Zhadiger" for Kazakh Language developed using NVIDIA...

30
Emerging
2358 Martouta/speech_processor

Speech-to-text from video and audio (including YouTube and TikTok links)

30
Emerging
2359 empowerai/fs-middlelayer-api

US Forest Service ePermit API

30
Emerging
2360 Adibian/ResGrad

Unofficial implementation of ResGrad: Residual Denoising Diffusion...

30
Emerging
2361 nonwill/GoldenDict-OCR

GoldenDict++: Optimizations for faster dictionary loading and searching,...

30
Emerging
2362 kurianbenoy/malayalam_asr_benchmarking

A study to benchmark whisper based ASRs in Malayalam

30
Emerging
2363 lottev1991/Project-AIdol-Public-English-Dataset

Public female English corpus used for Project AI❤dol

30
Emerging
2364 avarayr/yap-for-cursor

Yap for Cursor - Voice To Text integration for Cursor IDE

30
Emerging
2365 clloret/speaking-practice

An Android application to practice English pronunciation

30
Emerging
2366 amscotti/hn-podcaster

The HackerNews Podcaster is a JavaScript application that utilizes the power...

30
Emerging
2367 parthgupta1208/VoiceCraft

Voice Craft is a desktop AI assistance tool designed to help people with...

30
Emerging
2368 rafaelvalle/asrgen

Attacking Speaker Recognition with Deep Generative Models

30
Emerging
2369 I5UCC/VRCTextboxSTT

A SpeechToText application that uses OpenAI's whisper via faster-whisper to...

30
Emerging
2370 reyniel26/bleepy

Bleepy is a Python program that can block Tagalog and English profanity in...

30
Emerging
2371 qiujiali/lattice_rnn

Bi-directional Lattice Recurrent Neural Networks for Confidence Estimation

30
Emerging
2372 andi611/Conditional-SpecGAN-Tensorflow

Text-to-Speech Synthesis by Generating Spectrograms using Generative...

30
Emerging
2373 lars76/forced-alignment-chinese

Mandarin Chinese audio datasets aligned with Montreal Forced Aligner

30
Emerging
2374 Yeti47/Vosk4Unity

Vosk4Unity is a module for the Unity Engine that provides a simple way to...

30
Emerging
2375 xeden3/MSSpeechServer

MSSpeechServer is a REST server based on the Microsoft Speech Platform that...

30
Emerging
2376 doubleZ0108/Human-Computer-Interaction

Human-Computer Interaction | Tongji Univ. SSE Course Projects

30
Emerging
2377 deepily/genie-in-the-box

Genie in the Box: Distill Whisper STT => Mistral-7B =>...

30
Emerging
2378 EuleMitKeule/speaker-recognition

Speaker recognition service for Home Assistant using voice embeddings. Train...

30
Emerging
2379 jame25/Piper-Tray

Piper Tray is a lightweight system tray utility written in C# for use with Piper TTS.

30
Emerging
2380 scottgl9/openclaw-matrix-voice

Matrix voice call bot with LiveKit, Whisper STT, and Chatterbox TTS,...

30
Emerging
2381 zhihanyang2022/gender-audio-classification

A speaker gender classifier. MFC feature engineering and a pre-trained...

30
Emerging
2382 nezhar/speech-condenser

A tool for summarizing dialogues from videos or audio

30
Emerging
2383 FomTarro/word-salad

Twitch TTS redeem that uses sentence mixing instead of synthesis.

30
Emerging
2384 EmZod/Speak-Turbo

Ultra-fast local TTS for AI agents. ~90ms to first sound.

30
Emerging
2385 asaddi/f5-tts-serve

A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful...

30
Emerging
2386 slackr31337/wyoming-piper-gpu

Wyoming Piper docker container with GPU support for Home-Assistant

30
Emerging
2387 shafaypro/PYSHA

A Simple Virtual Assistant Built in Python 3.5

30
Emerging
2388 NONAN23x/WhisperingNova

An AI voice changer harnessing the power of OpenAI and VoiceVox for...

30
Emerging
2389 MiguelsPizza/local-transcription-mcp--parakeet-tdt-0.6b-v2--

Local MCP server that converts and transcribes video and audio files 100% on device

30
Emerging
2390 Rishav-Agarwal/Translate-Language_Translator

An Android app that allows you to translate text and phrases between 90+...

30
Emerging
2391 legekka/GanyuTTS

A small VITS+SOVITS/RVC TTS API

30
Emerging
2392 Vishnu-tppr/NEXORA-AI

Made with Python, crafted by Vishnu 💻✨ Nexora AI – A smart Python voice...

30
Emerging
2393 koudounasalkis/AI4Voice

This repo contains the code for "Voice Disorder Analysis: A...

30
Emerging
2394 msalhab96/MultiSpeech

PyTorch implementation for MultiSpeech: Multi-Speaker Text to Speech with...

30
Emerging
2395 xnmeet/voi

A text-to-speech plugin for [Bob](https://bobtranslate.com/) that uses a locally deployed Kokoro model as the speech synthesis service.

30
Emerging
2396 LuluW8071/Conformer

End-to-End Speech Recognition Training with Conformer CTC using PyTorch Lightning⚡

30
Emerging
2397 skit-ai/speech-recognition

SDKs and docs for Skit's speech to text service

30
Emerging
2398 Yuan-ManX/ComfyUI-ChatterboxTTS

ComfyUI-ChatterboxTTS is now available in ComfyUI, Chatterbox is the first...

30
Emerging
2399 AsoSoft/AsoSoft-TTS-Speech-Corpus-for-Central-Kurdish

AsoSoft Speech Corpus for Central-Kurdish Text-To-Speech

30
Emerging
2400 fernicar/Parakeet_GUI_TINS_Edition

A desktop application built using the TINS paradigm for transcribing audio...

30
Emerging