All Voice AI Tools

6,981 tools ranked by quality score · Page 11 of 70

Showing 1001–1100 of 6,981
# Tool Score Tier
1001 R1ckShi/AESRC2020

[ICASSP2021] Data preparation scripts, training pipeline and baseline...

43
Emerging
1002 chrisurf/obsidian-voice

🔊 The Obsidian Voice plugin lets you listen to your written content being...

43
Emerging
1003 mitchib1440/SpeakThat

The world's most comprehensive notification reader for Android devices.

43
Emerging
1004 thetobysiu/Deepstory

Deepstory turns a text/generated text into a video where the character is...

43
Emerging
1005 dbklim/Voice_ChatBot

Chatbot in Russian with speech recognition using PocketSphinx and speech...

43
Emerging
1006 Appen/UHV-OTS-Speech

A data annotation pipeline to generate high-quality, large-scale speech...

43
Emerging
1007 gunarakulangunaretnam/real-time-language-translator

A voice recognition-based tool for translating languages in real-time.

43
Emerging
1008 rishikksh20/melgan

MelGAN implementation with Multi-Band and Full Band supports...

43
Emerging
1009 xyqfer/reader

Graduation project: a smartphone-based newspaper reader

43
Emerging
1010 ai-learning-tools/viva-translate

Real-time translation copilot for your browser

43
Emerging
1011 candlewill/Speech-Corpus-Collection

A Collection of Speech Corpus for ASR and TTS

43
Emerging
1012 ttop32/coqui_tts_korea

Korean TTS using coqui TTS (glowtts and multiband melgan)

43
Emerging
1013 jianchang512/zh_recogn

Recognizes Chinese speech in audio or video and exports it as SRT subtitles, based on the ModelScope community's Paraformer model

43
Emerging
1014 stefantaubert/mel-cepstral-distance

A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral...

43
Emerging
1015 AIFSH/ComfyUI-XTTS

A custom ComfyUI node for coqui-ai/TTS's xtts module! Supports 17 languages...

43
Emerging
1016 tomasz-oponowicz/spoken_language_identification

Identify a spoken language using artificial intelligence (LID).

43
Emerging
1017 sp-nitech/DNN-HSMM

PyTorch implementation of DNN-HSMM for TTS

43
Emerging
1018 EricBatlle/UnityAndroidSpeechRecognizer

🗣️ Speech recognition on Unity and Android without the annoying google popup!

43
Emerging
1019 taresh18/TTSizer

🎙️ Automatically transcribe audio/video into high-quality, speaker-specific...

43
Emerging
1020 zceng/LVCNet

LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation

43
Emerging
1021 keonlee9420/WaveGrad2

PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement...

43
Emerging
1022 chenmingxiang110/Chinese-automatic-speech-recognition

Chinese speech recognition

43
Emerging
1023 kurianbenoy/Indic-Subtitler

Open source subtitling platform 💻 for transcribing and translating...

43
Emerging
1024 teamsudocode/dexter

Let your talking do the code

43
Emerging
1025 garvys-org/rustfst

Rust re-implementation of OpenFST - library for constructing, combining,...

43
Emerging
1026 asticode/go-astideepspeech

Golang bindings for Mozilla's DeepSpeech speech-to-text library

43
Emerging
1027 journey-ad/CosyVoice2-Ex

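CosyVoice2 feature extensions (pretrained voice inference / 3s rapid cloning / natural-language control / automatic recognition / voice model saving / API)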

43
Emerging
1028 chandran-jr/Noteify

🔎A Currency Detection app for the visually impaired which automatically...

43
Emerging
1029 keonlee9420/Cross-Speaker-Emotion-Transfer

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based...

43
Emerging
1030 cboard-org/cboard-api

Cboard API provides backend functionality and persistence to the Cboard application

43
Emerging
1031 xxbb1234021/speech_recognition

Chinese speech recognition

43
Emerging
1032 JosefAlbers/WTM

Blazing fast whisper turbo for ASR (speech-to-text) tasks

43
Emerging
1033 p0p4k/pflowtts_pytorch

Unofficial implementation of NVIDIA P-Flow TTS paper

43
Emerging
1034 devnen/Kitten-TTS-Server

Self-host the ultra-lightweight Kitten TTS model with this enhanced API...

43
Emerging
1035 Sgvkamalakar/Azure-Talking-Avatar

Explore the power of Azure Text-to-Speech with interactive talking avatar,...

43
Emerging
1036 metame-ai/awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation,...

43
Emerging
1037 developers-cosmos/Mimasa

Real time multilingual face translator

43
Emerging
1038 hi-paris/Prosody-Control-French-TTS

An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control

43
Emerging
1039 AASHISHAG/deepspeech-german

Automatic Speech Recognition (ASR) - German

43
Emerging
1040 alex-vt/WhisperInput

Offline voice input panel & keyboard with punctuation for Android.

43
Emerging
1041 agan-j/xiaoniu

Xiaoniu Video Translation supports local video translation, subtitle translation, and YouTube video translation and download; an AI...

43
Emerging
1042 rhulha/StreamingKokoroJS

Unlimited text-to-speech in the Browser using Kokoro-JS, 100% local, 100%...

43
Emerging
1043 Rongjiehuang/GenerSpeech

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model...

43
Emerging
1044 keonlee9420/PortaSpeech

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative...

43
Emerging
1045 NickZaitsev/ru-normalizr

ru-normalizr: the best open-source normalizer for Russian text. Converts...

43
Emerging
1046 fedden/RenderMan

Command line C++ and Python VSTi Host library with MFCC, FFT, RMS and audio...

43
Emerging
1047 tema6120/ForgetMeNot

A flashcard app for Android.

43
Emerging
1048 ponlponl123/-Prototype-AIVTuber

An open-source Artificial Intelligence Virtual Youtuber (AI VTuber), (this...

43
Emerging
1049 novoic/surfboard

Novoic's audio feature extraction library

43
Emerging
1050 ryanleary/patter

Speech-to-text in PyTorch

43
Emerging
1051 timmo001/home-assistant-assist-desktop

Use Home Assistant Assist on the desktop. Compatible with Windows, macOS, and Linux

43
Emerging
1052 mush42/sonata

A cross-platform inference engine for neural TTS models.

43
Emerging
1053 huakunyang/SummerTTS

SummerTTS...

43
Emerging
1054 voicetestdev/voicetest

Test harness for voice agents. Import from Retell, VAPI, Bland, LiveKit. Run...

43
Emerging
1055 keonlee9420/VAENAR-TTS

PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based...

43
Emerging
1056 algolia/voice-overlay-ios

🗣 An overlay that gets your user’s voice permission and input as text in a...

43
Emerging
1057 linux-speakup/espeakup

A lightweight connector for espeak-ng and speakup

43
Emerging
1058 huawei-noah/Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei...

43
Emerging
1059 wangkaisine/mrcp-plugin-with-freeswitch

Uses FreeSWITCH to accept calls from users' mobile phones, via UniMRCP...

43
Emerging
1060 qianchang/zici

Zici (characters and words): a collection of resources on Chinese classics and Chinese characters, words, and pinyin

43
Emerging
1061 Jakobovski/free-spoken-digit-dataset

A free audio dataset of spoken digits. An audio version of MNIST.

43
Emerging
1062 superstarryeyes/lue

Terminal eBook Reader with Audiobook-Quality Text-to-Speech — Supports EPUB,...

43
Emerging
1063 Baidu-AIP/speech-demo

Speech API examples

43
Emerging
1064 karim23657/Persian-tts-coqui

Persian/Farsi text-to-speech (TTS) training using coqui tts

43
Emerging
1065 kaituoxu/Speech-Transformer

A PyTorch implementation of Speech Transformer, an End-to-End ASR with...

43
Emerging
1066 hujingshuang/MTrans

Multi-source Translation

43
Emerging
1067 mozilla/DeepSpeech-examples

Examples of how to use or integrate DeepSpeech

43
Emerging
1068 jtkim-kaist/VAD

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM...

43
Emerging
1069 bjoernkarmann/project_alias

Alias is a teachable “parasite” that is designed to give users more control...

43
Emerging
1070 OpenCOVID19CoughCheck/CoughCheckApp

Development of AI audio app to compare the cough of a Coronavirus (COVID-19)...

43
Emerging
1071 wangz-code/legado-tts

Legado book reader app with built-in EdgeTTS read-aloud; listen to books with no extra deployment, install and listen; the speech engine uses rany2/edge-tts...

43
Emerging
1072 caizexin/tf_multispeakerTTS_fc

The TensorFlow version of multi-speaker TTS training with feedback constraint

43
Emerging
1073 nyrahealth/CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps...

43
Emerging
1074 abdozmantar/ComfyUI-DeepExtractV2

DeepExtractV2 – lightning-fast, high-quality audio separator. Instantly...

43
Emerging
1075 nobody132/masr

Mandarin automatic speech recognition

43
Emerging
1076 alexpinel/Dot

Text-To-Speech, RAG, and LLMs. All local!

43
Emerging
1077 LiberSonora/LiberSonora

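LiberSonora, meaning "the voice of freedom", is a powerful, AI-enabled, open-source audiobook toolkit with features such as intelligent subtitle extraction, AI title generation, and multilingual translation; supports...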

43
Emerging
1078 hcy71o/AutoVocoder

Autovocoder: Fast Waveform Generation from a Learned Speech Representation...

42
Emerging
1079 primaryobjects/voice-gender

Gender recognition by voice and speech analysis

42
Emerging
1080 BogiHsu/WG-WaveNet

Real-Time High-Fidelity Speech Synthesis without GPU

42
Emerging
1081 MattePalte/Verbify-TTS

Simple and free Text-to-Speech (TTS) engine that reads for you any text on...

42
Emerging
1082 shekit/alexa-sign-language-translator

A project to make Amazon Echo respond to sign language using your webcam

42
Emerging
1083 Rongjiehuang/Multi-Singer

PyTorch Implementation of Multi-Singer (ACM-MM'21)

42
Emerging
1084 PyThaiNLP/tts-thai

Thai TTS

42
Emerging
1085 lukeewin/FunASR_API

A speech recognition API with speaker diarization, built on FunASR

42
Emerging
1086 Navatusein/Silero-TTS-Service

Silero TTS backend service. Can be used with Home Assistant and Rhasspy.

42
Emerging
1087 keonlee9420/StyleSpeech

PyTorch Implementation of Meta-StyleSpeech: Multi-Speaker Adaptive...

42
Emerging
1088 artibex/piper-http

Creates a docker image that runs the piper http service

42
Emerging
1089 botany-labs/voice-ai-js-starter

Starter project for building real-time AI Voice Assistants

42
Emerging
1090 repodiac/german_transliterate

Python module to clean and transliterate (i.e. normalize) German text...

42
Emerging
1091 mravanelli/pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid...

42
Emerging
1092 Saganaki22/ComfyUI-Maya1_TTS

A ComfyUI node for Maya1, a 3B-parameter speech model built for expressive...

42
Emerging
1093 googlecreativelab/obvi

A Polymer 3+ webcomponent / button for doing speech recognition

42
Emerging
1094 trungnguyen21/AutomatedYoutubeShorts

Automatically generate videos based on given content!

42
Emerging
1095 keonlee9420/FastPitchFormant

PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based...

42
Emerging
1096 neosapience/editts

Official implementation of EdiTTS: Score-based Editing for Controllable...

42
Emerging
1097 sberdevices/assistant-client

A tool for testing and debugging CanvasApps, skills for the family of...

42
Emerging
1098 googlecreativelab/morse-speak-demo

Text-to-Speech (TTS) demo web app that converts written text into spoken...

42
Emerging
1099 declare-lab/jamify

JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and...

42
Emerging
1100 michaelzhang-ai/Text2Video

ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with...

42
Emerging