All Voice AI Tools

6,981 tools ranked by quality score · Page 7 of 70

Showing 601–700 of 6,981

#	Tool	Score	Tier	Category	Stars	Language
601	myshell-ai/MeloTTS High-quality multi-lingual text-to-speech library by MyShell.ai. Support...	48	Emerging	lightweight-tts-runtimes	7,267	Python
602	ringger/transcribe-critic Multi-source transcript merging inspired by textual criticism — LLM...	48	Emerging	whisper-diarization	14	Python
603	zlargon/google-tts Google TTS (Text-To-Speech) for node.js	48	Emerging	google-tts-libraries	286	JavaScript
604	AutoArk/GPA [AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion...	48	Emerging	telegram-voice-transcription	97	Python
605	artcore-c/AI-Voice-Clone-with-Coqui-XTTS-v2 Free voice cloning for creators using Coqui XTTS-v2 on Google Colab. Clone...	48	Emerging	voice-cloning-tools	34	Python
606	devnen/Dia-TTS-Server Self-host the powerful Dia TTS model. This server offers a user-friendly Web...	48	Emerging	self-hosted-tts-servers	346	Python
607	alesaccoia/VoiceStreamAI Near-Realtime audio transcription using self-hosted Whisper and WebSocket in...	48	Emerging	speech-to-text-converters	950	Python
608	Picovoice/cobra On-device voice activity detection (VAD) powered by deep learning	48	Emerging	ios-speech-frameworks	248	Python
609	tarepan/VoiceConversionLab A collection of voice conversion research	48	Emerging	voice-cloning-synthesis	96	TypeScript
610	KKshitiz/J.A.R.V.I.S Iron man inspired Personal virtual assistant	48	Emerging	python-voice-assistants	72	Python
611	jxzhanggg/nonparaSeq2seqVC_code Implementation code of non-parallel sequence-to-sequence VC	48	Emerging	fastspeech-tts-models	248	Python
612	just-ai/aimybox-android-assistant Embeddable custom voice assistant for Android applications	48	Emerging	android-voice-assistants	274	Kotlin
613	abhirooptalasila/AutoSub A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using...	48	Emerging	whisper-subtitle-generation	651	Python
614	dngda/bot-whatsapp Unmaintained - Multipurpose WhatsApp Bot 🤖 using open-wa/wa-automate-nodejs...	48	Emerging	telegram-voice-transcription	93	JavaScript
615	yl4579/AuxiliaryASR Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)	48	Emerging	end-to-end-asr-frameworks	125	Python
616	openspeech-team/openspeech Open-Source Toolkit for End-to-End Speech Recognition leveraging...	48	Emerging	end-to-end-asr-frameworks	718	Python
617	HiMeditator/auto-caption A cross-platform real-time subtitle display software.	48	Emerging	live-caption-generation	497	TypeScript
618	reriiasu/speech-to-text Real-time transcription using faster-whisper	48	Emerging	speech-to-text-converters	613	HTML
619	SuyashMore/MevonAI-Speech-Emotion-Recognition Identify the emotion of multiple speakers in an Audio Segment	48	Emerging	speech-emotion-recognition	179	C
620	tsurumeso/vocal-remover Vocal Remover using Deep Neural Networks	48	Emerging	audio-source-separation	1,744	Python
621	keenresearch/keenasr-ios-poc Proof of concept app that demonstrates use of KeenASR SDK in ObjC. WE ARE...	48	Emerging	ios-speech-frameworks	70	Objective-C
622	jpuigcerver/Laia Laia: A deep learning toolkit for HTR based on Torch	48	Emerging	text-to-speech-frameworks	151	Shell
623	Aivis-Project/AivisSpeech AivisSpeech: AI Voice Imitation System - Text to Speech Software	48	Emerging	openai-tts-applications	423	TypeScript
624	gitmylo/bark-voice-cloning-HuBERT-quantizer The code for the bark-voicecloning model. Training and inference.	48	Emerging	voice-cloning-tools	711	Python
625	blaisewf/rvc-cli 🚀 RVC + UVR = A perfect set of tools for voice cloning, easily and free!	48	Emerging	voice-cloning-synthesis	230	Python
626	KinglittleQ/GST-Tacotron A PyTorch implementation of Style Tokens: Unsupervised Style Modeling,...	48	Emerging	tacotron-tts-models	374	Python
627	BoltzmannEntropy/xtts2-ui A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech	48	Emerging	coqui-tts-applications	391	Python
628	gentaiscool/end2end-asr-pytorch End-to-End Automatic Speech Recognition on PyTorch	48	Emerging	end-to-end-asr-frameworks	304	Python
629	keonlee9420/STYLER Official repository of STYLER: Style Factor Modeling with Rapidity and...	48	Emerging	fastspeech-tts-models	160	Python
630	gooofy/zamia-speech Open tools and data for cloudless automatic speech recognition	48	Emerging	automatic-speech-recognition	446	Python
631	XiaoMi/kaldi-onnx Kaldi model converter to ONNX	48	Emerging	kaldi-asr-ecosystem	247	Python
632	haoheliu/voicefixer_main General Speech Restoration	48	Emerging	speaker-diarization-embedding	284	Python
633	EnjiRouz/Voice-Assistant-App Python Voice Assistant project can: recognize and synthesize speech without...	48	Emerging	general-purpose-voice-assistants	129	Python
634	filippogiruzzi/voice_activity_detection Voice Activity Detection based on Deep Learning & TensorFlow	48	Emerging	speaker-diarization-embedding	371	Python
635	gexgd0419/NaturalVoiceSAPIAdapter Make Azure natural TTS voices accessible to any SAPI 5-compatible application.	48	Emerging	dotnet-tts-libraries	702	C++
636	yl4579/PL-BERT Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions	48	Emerging	fastspeech-tts-models	268	Python
637	rolczynski/Automatic-Speech-Recognition 🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)	48	Emerging	keyword-speech-recognition	223	Python
638	harry0703/AudioNotes Quickly extract audio and video content and organize it into structured Markdown notes	48	Emerging	meeting-transcription-summarizers	1,993	Python
639	clovaai/ClovaCall ClovaCall dataset and Pytorch LAS baseline code (Interspeech 2020)	48	Emerging	end-to-end-asr-frameworks	223	Python
640	iamjanvijay/rnnt_decoder_cuda An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.	48	Emerging	end-to-end-asr-frameworks	67	Cuda
641	fulldecent/vowel-practice iOS application for finding formants in spoken sounds	48	Emerging	ios-speech-frameworks	66	Swift
642	rishikksh20/FastSpeech2 PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End...	48	Emerging	fastspeech-tts-models	233	Jupyter Notebook
643	lucasnewman/best-rq-pytorch Implementation of BEST-RQ - a model for self-supervised learning of speech...	48	Emerging	neural-vocoder-implementations	133	Python
644	philipperemy/tensorflow-ctc-speech-recognition Application of Connectionist Temporal Classification (CTC) for Speech...	48	Emerging	ctc-asr-implementations	131	Python
645	andresayac/edge-tts Edge TTS is a Node or Bun package that allows access to the online...	48	Emerging	edge-tts-implementations	121	TypeScript
646	thepirat000/spleeter-api Audio separation API using Spleeter from Deezer	48	Emerging	audio-source-separation	121	C#
647	PlayVoice/vits_chinese Best practice TTS based on BERT and VITS with some Natural Speech Features...	48	Emerging	vits-tts-implementations	1,227	Python
648	Mag1cFall/AIStudio2API OpenAI-compatible API proxy for Google AI Studio	48	Emerging	openai-tts-applications	91	Python
649	enhuiz/vall-e An unofficial PyTorch implementation of the audio LM VALL-E	48	Emerging	tacotron-tts-models	2,992	Python
650	voicekit-team/T-one T-one is a high-performance streaming ASR pipeline for Russian, specialized...	47	Emerging	end-to-end-asr-frameworks	249	Python
651	prateekkalra/Selection-js A lightweight JavaScript library which provides users with a set of options...	47	Emerging	web-speech-api-tts	95	JavaScript
652	cvqluu/Factorized-TDNN PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal...	47	Emerging	tacotron-tts-models	149	Python
653	saiteja-talluri/Speech2Face Implementation of the CVPR 2019 Paper - Speech2Face: Learning the Face...	47	Emerging	fastspeech-tts-models	178	Python
654	symblai/speech-recognition-evaluation Evaluate results from ASR/Speech-to-Text quickly	47	Emerging	asr-evaluation-metrics	41	JavaScript
655	roatienza/efficientspeech PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.	47	Emerging	fastspeech-tts-models	180	Jupyter Notebook
656	IBM/MAX-Speech-to-Text-Converter Converts spoken words into text form.	47	Emerging	speech-recognition-apis	76	Python
657	linto-ai/linto-studio Transcription and annotation interface for recorded audio or video files	47	Emerging	whisper-diarization	53	JavaScript
658	microsoft/UniSpeech UniSpeech - Large Scale Self-Supervised Learning for Speech	47	Emerging	voice-ai-learning-collections	479	Python
659	yy4382/tts-importer Easily import the Azure TTS speech synthesis service into reading apps. Currently supports 阅读 (legado), 爱阅记, and 源阅读.	47	Emerging	google-tts-libraries	397	TypeScript
660	alphacep/vosk-asterisk Speech Recognition in Asterisk with Vosk Server	47	Emerging	vosk-asr-implementations	128	C
661	mediatechlab/tts-wrapper TTS-Wrapper makes it easier to use text-to-speech APIs by providing a...	47	Emerging	lightweight-tts-libraries	21	Python
662	LEEYOONHYUNG/BVAE-TTS Official implementation of BVAE-TTS	47	Emerging	text-to-speech-frameworks	173	Python
663	StephenVinouze/KontinuousSpeechRecognizer A Kotlin Speech Recognizer that runs continuously and is triggered with an...	47	Emerging	android-speech-apps	144	Kotlin
664	hs-CN/msedge-tts This library is a wrapper of MSEdge Read aloud function API. You can use it...	47	Emerging	edge-tts-implementations	19	Rust
665	IS2AI/Kazakh_TTS An expanded version of the previously released Kazakh text-to-speech...	47	Emerging	tts-dataset-creation	147	Shell
666	Nikorasu/LiveWhisper A nearly-live implementation of OpenAI's Whisper, using sounddevice....	47	Emerging	speech-to-text-converters	360	Python
667	BandarLabs/gitpodcast Convert any git repository into an engaging podcast	47	Emerging	content-to-podcast-converters	803	TypeScript
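668	chenyme/Chenyme-AAVT A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and then merges the subtitles with the video to produce a translated video.	47	Emerging	video-transcription-extraction	2,973	Python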
669	coqui-ai/STT 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying...	47	Emerging	voice-cloning-synthesis	2,572	C++
670	exPHAT/SwiftWhisper 🎤 The easiest way to transcribe audio in Swift	47	Emerging	local-voice-dictation	771	Swift
671	lmnt-com/wavegrad A fast, high-quality neural vocoder.	47	Emerging	audio-noise-reduction	296	Python
672	EgorLakomkin/KTSpeechCrawler Automatically constructing corpus for automatic speech recognition from...	47	Emerging	speech-corpora-datasets	157	Python
673	tiberiu44/TTS-Cube End-2-end speech synthesis with recurrent neural networks	47	Emerging	neural-vocoder-implementations	223	Python
674	JJWRoeloffs/transcribe_align_textgrid A small wrapper package around whisper-timestamped. Create force-aligned...	47	Emerging	video-transcription-extraction	18	Python
675	ArchishmanSengupta/autovoiceevals A self-improving loop for voice AI agents. Uses karpathy's autoresearch as...	47	Emerging	voice-agent-applications	83	Python
676	1038lab/ComfyUI-EdgeTTS ComfyUI-EdgeTTS is a powerful text-to-speech node for ComfyUI, leveraging...	47	Emerging	comfyui-tts-nodes	66	Python
677	gitmylo/audio-webui A webui for different audio related Neural Networks	47	Emerging	audio-source-separation	1,240	Python
678	bricewalker/Hey-Jetson Deep Learning based Automatic Speech Recognition with attention for the...	47	Emerging	speaker-diarization-embedding	199	Jupyter Notebook
679	shamspias/vibevoice-studio Beautiful voice app: record or upload to train a voice, generate speech from...	47	Emerging	vibe-coding-frameworks	56	Python
680	Ashish-Patnaik/kokoclone Voice Cloning, Now Inside Kokoro. Generate natural multilingual speech and...	47	Emerging	kokoro-tts-ecosystem	62	Python
681	ycyy/edge-tts-webui edge-tts webui	47	Emerging	gradio-tts-webuis	111	Python
682	puntorigen/podcast_tts A class for generating realistic audio (TTS) for podcasts and dialogues.	47	Emerging	content-to-podcast-converters	65	Python
683	bensonruan/Chrome-Web-Speech-API Chrome Web Speech API	47	Emerging	audio-transcription-apps	117	JavaScript
684	dspavankumar/keras-kaldi Keras Interface for Kaldi ASR	47	Emerging	kaldi-asr-ecosystem	122	Python
685	DrewThomasson/VoxNovel VoxNovel: generate audiobooks giving each character a different voice actor.	47	Emerging	text-to-speech	352	Python
686	BatuhanYilmaz26/Auto-Subtitled-Video-Generator Input a YouTube video link or upload a video file and get a video with subtitles.	47	Emerging	whisper-transcription-apps	124	Python
687	atomicoo/tacotron2-mandarin Tensorflow implementation of Chinese/Mandarin TTS (Text-to-Speech) based on...	47	Emerging	fastspeech-tts-models	131	Python
688	VidyasagarMSC/WatBot An Android ChatBot powered by IBM Watson Services (Assistant V1,...	47	Emerging	voice-command-assistants	72	Java
689	xenova/whisper-web ML-powered speech recognition directly in your browser	47	Emerging	whisper-speech-transcription	3,257	TypeScript
690	ivcylc/OpenMusic OpenMusic: SOTA Text-to-music (TTM) Generation	47	Emerging	speech-synthesis-diffusion	634	Python
691	travisvn/edge-tts-client Client-side (web browser) implementation of Edge TTS package — Microsoft...	47	Emerging	edge-tts-implementations	22	TypeScript
692	BuildWithAIs/voicekey Voice to text, one key to input.	47	Emerging	local-voice-dictation	142	TypeScript
693	yl4579/PitchExtractor Deep Neural Pitch Extractor for Voice Conversion and TTS Training	47	Emerging	tacotron-tts-models	147	Python
694	BogiHsu/Tacotron2-PyTorch Yet another PyTorch implementation of Tacotron 2 with reduction factor and...	47	Emerging	tacotron-tts-models	148	Python
695	amazon-archives/amazon-polly-sample Sample application for Amazon Polly. Allows to convert any blog into an...	47	Emerging	aws-polly-tts	152	Python
696	longluo/EbookReader The EbookReader Android App. Support file format like epub, pdf, txt, html,...	47	Emerging	ai-powered-ereaders	154	Java
697	jimbozhang/kaldi-gop Kaldi-based goodness of pronunciation (GOP)	47	Emerging	kaldi-asr-ecosystem	159	C++
698	louiskirsch/speechT An opensource speech-to-text software written in tensorflow	47	Emerging	wav2vec2-asr-models	160	Python
699	supershaneski/openai-whisper-talk openai-whisper-talk is a sample voice conversation application powered by...	47	Emerging	conversational-rag-agents	163	JavaScript
700	Kyubyong/tacotron_asr Speech Recognition Using Tacotron	47	Emerging	tacotron-tts-models	164	Python
