All Voice AI Tools
6,981 tools ranked by quality score · Page 7 of 70
| # | Tool | Score | Tier |
|---|---|---|---|
| 601 | myshell-ai/MeloTTS: High-quality multi-lingual text-to-speech library by MyShell.ai. Support... | | Emerging |
| 602 | ringger/transcribe-critic: Multi-source transcript merging inspired by textual criticism - LLM... | | Emerging |
| 603 | zlargon/google-tts: Google TTS (Text-To-Speech) for Node.js | | Emerging |
| 604 | AutoArk/GPA: [AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion... | | Emerging |
| 605 | artcore-c/AI-Voice-Clone-with-Coqui-XTTS-v2: Free voice cloning for creators using Coqui XTTS-v2 on Google Colab. Clone... | | Emerging |
| 606 | devnen/Dia-TTS-Server: Self-host the powerful Dia TTS model. This server offers a user-friendly Web... | | Emerging |
| 607 | alesaccoia/VoiceStreamAI: Near-real-time audio transcription using self-hosted Whisper and WebSocket in... | | Emerging |
| 608 | Picovoice/cobra: On-device voice activity detection (VAD) powered by deep learning | | Emerging |
| 609 | tarepan/VoiceConversionLab: A collection of voice conversion research | | Emerging |
| 610 | KKshitiz/J.A.R.V.I.S: Iron Man-inspired personal virtual assistant | | Emerging |
| 611 | jxzhanggg/nonparaSeq2seqVC_code: Implementation code of non-parallel sequence-to-sequence VC | | Emerging |
| 612 | just-ai/aimybox-android-assistant: Embeddable custom voice assistant for Android applications | | Emerging |
| 613 | abhirooptalasila/AutoSub: A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using... | | Emerging |
| 614 | dngda/bot-whatsapp: Unmaintained - Multipurpose WhatsApp Bot 🤖 using open-wa/wa-automate-nodejs... | | Emerging |
| 615 | yl4579/AuxiliaryASR: Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) | | Emerging |
| 616 | openspeech-team/openspeech: Open-Source Toolkit for End-to-End Speech Recognition leveraging... | | Emerging |
| 617 | HiMeditator/auto-caption: A cross-platform real-time subtitle display software. | | Emerging |
| 618 | reriiasu/speech-to-text: Real-time transcription using faster-whisper | | Emerging |
| 619 | SuyashMore/MevonAI-Speech-Emotion-Recognition: Identify the emotion of multiple speakers in an Audio Segment | | Emerging |
| 620 | tsurumeso/vocal-remover: Vocal Remover using Deep Neural Networks | | Emerging |
| 621 | keenresearch/keenasr-ios-poc: Proof of concept app that demonstrates use of KeenASR SDK in ObjC. WE ARE... | | Emerging |
| 622 | jpuigcerver/Laia: Laia: A deep learning toolkit for HTR based on Torch | | Emerging |
| 623 | Aivis-Project/AivisSpeech: AivisSpeech: AI Voice Imitation System - Text to Speech Software | | Emerging |
| 624 | gitmylo/bark-voice-cloning-HuBERT-quantizer: The code for the bark-voicecloning model. Training and inference. | | Emerging |
| 625 | blaisewf/rvc-cli: 🚀 RVC + UVR = A perfect set of tools for voice cloning, easily and free! | | Emerging |
| 626 | KinglittleQ/GST-Tacotron: A PyTorch implementation of Style Tokens: Unsupervised Style Modeling,... | | Emerging |
| 627 | BoltzmannEntropy/xtts2-ui: A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech | | Emerging |
| 628 | gentaiscool/end2end-asr-pytorch: End-to-End Automatic Speech Recognition on PyTorch | | Emerging |
| 629 | keonlee9420/STYLER: Official repository of STYLER: Style Factor Modeling with Rapidity and... | | Emerging |
| 630 | gooofy/zamia-speech: Open tools and data for cloudless automatic speech recognition | | Emerging |
| 631 | XiaoMi/kaldi-onnx: Kaldi model converter to ONNX | | Emerging |
| 632 | haoheliu/voicefixer_main: General Speech Restoration | | Emerging |
| 633 | EnjiRouz/Voice-Assistant-App: Python Voice Assistant project can: recognize and synthesize speech without... | | Emerging |
| 634 | filippogiruzzi/voice_activity_detection: Voice Activity Detection based on Deep Learning & TensorFlow | | Emerging |
| 635 | gexgd0419/NaturalVoiceSAPIAdapter: Make Azure natural TTS voices accessible to any SAPI 5-compatible application. | | Emerging |
| 636 | yl4579/PL-BERT: Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions | | Emerging |
| 637 | rolczynski/Automatic-Speech-Recognition: 🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow) | | Emerging |
| 638 | harry0703/AudioNotes: Quickly extract content from audio and video and organize it into structured Markdown notes | | Emerging |
| 639 | clovaai/ClovaCall: ClovaCall dataset and PyTorch LAS baseline code (Interspeech 2020) | | Emerging |
| 640 | iamjanvijay/rnnt_decoder_cuda: An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA. | | Emerging |
| 641 | fulldecent/vowel-practice: iOS application for finding formants in spoken sounds | | Emerging |
| 642 | rishikksh20/FastSpeech2: PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End... | | Emerging |
| 643 | lucasnewman/best-rq-pytorch: Implementation of BEST-RQ - a model for self-supervised learning of speech... | | Emerging |
| 644 | philipperemy/tensorflow-ctc-speech-recognition: Application of Connectionist Temporal Classification (CTC) for Speech... | | Emerging |
| 645 | andresayac/edge-tts: Edge TTS is a Node or Bun package that allows access to the online... | | Emerging |
| 646 | thepirat000/spleeter-api: Audio separation API using Spleeter from Deezer | | Emerging |
| 647 | PlayVoice/vits_chinese: Best practice TTS based on BERT and VITS with some Natural Speech Features... | | Emerging |
| 648 | Mag1cFall/AIStudio2API: OpenAI-compatible API proxy for Google AI Studio | | Emerging |
| 649 | enhuiz/vall-e: An unofficial PyTorch implementation of the audio LM VALL-E | | Emerging |
| 650 | voicekit-team/T-one: T-one is a high-performance streaming ASR pipeline for Russian, specialized... | | Emerging |
| 651 | prateekkalra/Selection-js: A lightweight JavaScript library which provides users with a set of options... | | Emerging |
| 652 | cvqluu/Factorized-TDNN: PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal... | | Emerging |
| 653 | saiteja-talluri/Speech2Face: Implementation of the CVPR 2019 Paper - Speech2Face: Learning the Face... | | Emerging |
| 654 | symblai/speech-recognition-evaluation: Evaluate results from ASR/Speech-to-Text quickly | | Emerging |
| 655 | roatienza/efficientspeech: PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. | | Emerging |
| 656 | IBM/MAX-Speech-to-Text-Converter: Converts spoken words into text form. | | Emerging |
| 657 | linto-ai/linto-studio: Transcription and annotation interface for recorded audio or video files | | Emerging |
| 658 | microsoft/UniSpeech: UniSpeech - Large Scale Self-Supervised Learning for Speech | | Emerging |
| 659 | yy4382/tts-importer: Easily import the Azure TTS speech synthesis service into reading apps. Currently supports Legado (阅读), 爱阅记, and 源阅读. | | Emerging |
| 660 | alphacep/vosk-asterisk: Speech Recognition in Asterisk with Vosk Server | | Emerging |
| 661 | mediatechlab/tts-wrapper: TTS-Wrapper makes it easier to use text-to-speech APIs by providing a... | | Emerging |
| 662 | LEEYOONHYUNG/BVAE-TTS: Official implementation of BVAE-TTS | | Emerging |
| 663 | StephenVinouze/KontinuousSpeechRecognizer: A Kotlin Speech Recognizer that runs continuously and is triggered with an... | | Emerging |
| 664 | hs-CN/msedge-tts: This library is a wrapper of MSEdge Read aloud function API. You can use it... | | Emerging |
| 665 | IS2AI/Kazakh_TTS: An expanded version of the previously released Kazakh text-to-speech... | | Emerging |
| 666 | Nikorasu/LiveWhisper: A nearly-live implementation of OpenAI's Whisper, using sounddevice.... | | Emerging |
| 667 | BandarLabs/gitpodcast: Convert any git repository into an engaging podcast | | Emerging |
| 668 | chenyme/Chenyme-AAVT: A fully automatic (audio) video translation project. It uses Whisper to recognize speech, an AI large language model to translate the subtitles, and then merges the subtitles with the video to produce the translated video. | | Emerging |
| 669 | coqui-ai/STT: 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying... | | Emerging |
| 670 | exPHAT/SwiftWhisper: 🎤 The easiest way to transcribe audio in Swift | | Emerging |
| 671 | lmnt-com/wavegrad: A fast, high-quality neural vocoder. | | Emerging |
| 672 | EgorLakomkin/KTSpeechCrawler: Automatically constructing corpus for automatic speech recognition from... | | Emerging |
| 673 | tiberiu44/TTS-Cube: End-2-end speech synthesis with recurrent neural networks | | Emerging |
| 674 | JJWRoeloffs/transcribe_align_textgrid: A small wrapper package around whisper-timestamped. Create force-aligned... | | Emerging |
| 675 | ArchishmanSengupta/autovoiceevals: A self-improving loop for voice AI agents. Uses karpathy's autoresearch as... | | Emerging |
| 676 | 1038lab/ComfyUI-EdgeTTS: ComfyUI-EdgeTTS is a powerful text-to-speech node for ComfyUI, leveraging... | | Emerging |
| 677 | gitmylo/audio-webui: A webui for different audio related Neural Networks | | Emerging |
| 678 | bricewalker/Hey-Jetson: Deep Learning based Automatic Speech Recognition with attention for the... | | Emerging |
| 679 | shamspias/vibevoice-studio: Beautiful voice app: record or upload to train a voice, generate speech from... | | Emerging |
| 680 | Ashish-Patnaik/kokoclone: Voice Cloning, Now Inside Kokoro. Generate natural multilingual speech and... | | Emerging |
| 681 | ycyy/edge-tts-webui: edge-tts webui | | Emerging |
| 682 | puntorigen/podcast_tts: A class for generating realistic audio (TTS) for podcasts and dialogues. | | Emerging |
| 683 | bensonruan/Chrome-Web-Speech-API: Chrome Web Speech API | | Emerging |
| 684 | dspavankumar/keras-kaldi: Keras Interface for Kaldi ASR | | Emerging |
| 685 | DrewThomasson/VoxNovel: VoxNovel: generate audiobooks giving each character a different voice actor. | | Emerging |
| 686 | BatuhanYilmaz26/Auto-Subtitled-Video-Generator: Input a YouTube video link or upload a video file and get a video with subtitles. | | Emerging |
| 687 | atomicoo/tacotron2-mandarin: TensorFlow implementation of Chinese/Mandarin TTS (Text-to-Speech) based on... | | Emerging |
| 688 | VidyasagarMSC/WatBot: An Android ChatBot powered by IBM Watson Services (Assistant V1,... | | Emerging |
| 689 | xenova/whisper-web: ML-powered speech recognition directly in your browser | | Emerging |
| 690 | ivcylc/OpenMusic: OpenMusic: SOTA Text-to-music (TTM) Generation | | Emerging |
| 691 | travisvn/edge-tts-client: Client-side (web browser) implementation of Edge TTS package - Microsoft... | | Emerging |
| 692 | BuildWithAIs/voicekey: Voice to text, one key to input. | | Emerging |
| 693 | yl4579/PitchExtractor: Deep Neural Pitch Extractor for Voice Conversion and TTS Training | | Emerging |
| 694 | BogiHsu/Tacotron2-PyTorch: Yet another PyTorch implementation of Tacotron 2 with reduction factor and... | | Emerging |
| 695 | amazon-archives/amazon-polly-sample: Sample application for Amazon Polly. Allows converting any blog into an... | | Emerging |
| 696 | longluo/EbookReader: The EbookReader Android App. Supports file formats like epub, pdf, txt, html,... | | Emerging |
| 697 | jimbozhang/kaldi-gop: Kaldi-based goodness of pronunciation (GOP) | | Emerging |
| 698 | louiskirsch/speechT: Open-source speech-to-text software written in TensorFlow | | Emerging |
| 699 | supershaneski/openai-whisper-talk: openai-whisper-talk is a sample voice conversation application powered by... | | Emerging |
| 700 | Kyubyong/tacotron_asr: Speech Recognition Using Tacotron | | Emerging |