Voice AI Categories
.NET TTS Libraries
.NET/PowerShell libraries and SDKs for text-to-speech integration across multiple providers (Azure, OpenAI, Microsoft Speech SDK). Does NOT include general voice AI SDKs, platform-specific implementations, or non-.NET speech tools.
203 tools
General Purpose Voice Assistants
Standalone voice assistant applications with integrated speech recognition, NLP, and task automation capabilities. Does NOT include specialized assistants (emergency, customer service, voice cloning), mobile-only implementations, or tools focused primarily on a single function like speech-to-text.
187 tools
Lightweight TTS Libraries
Minimal, dependency-light text-to-speech implementations and wrappers for local/offline synthesis. Does NOT include API wrappers, cloud-based services, speech recognition, or production-grade TTS engines.
185 tools
Automatic Speech Recognition
Libraries, frameworks, and tools for building, training, and evaluating automatic speech recognition (ASR) systems. Does NOT include pre-built transcription APIs, TTS systems, or ASR applications like transcription apps or meeting summarizers.
161 tools
Web Speech API Libraries
Angular and JavaScript libraries wrapping the browser's native Web Speech API for speech recognition functionality. Does NOT include commercial speech APIs (Speechly, Deepgram), text-to-speech, or framework-agnostic speech frameworks.
149 tools
Web Speech API TTS
Browser-native text-to-speech implementations using the Web Speech API for client-side voice synthesis. Does NOT include cloud TTS services, advanced voice cloning, specialized model implementations (TensorFlow, Coqui, etc.), or documentation/content generation tools.
149 tools
Speech-To-Text Converters
Tools that transcribe audio files or streams into text using Whisper or similar models. Does NOT include diarization, video processing, subtitle generation, voice typing, or server/API implementations.
147 tools
Android Speech Apps
Native Android applications integrating speech-to-text (STT) and text-to-speech (TTS) capabilities for mobile use cases like messaging, navigation, and accessibility. Does NOT include cloud APIs, SDKs, or web-based tools.
113 tools
Keyword Speech Recognition
Machine learning models for recognizing isolated spoken words/commands from audio using CNNs, RNNs, and other neural networks. Does NOT include continuous speech-to-text ASR, end-to-end speech recognition pipelines, or general audio classification beyond single-word detection.
112 tools
End-to-End ASR Frameworks
PyTorch-based implementations of complete automatic speech recognition systems with integrated acoustic modeling, feature extraction, and decoding. Does NOT include ASR evaluation metrics, language models, individual components (vocoder, G2P), or non-PyTorch frameworks like Kaldi-only solutions.
109 tools
Local Voice Assistants
Complete offline voice assistant systems that combine speech recognition, language models, and text-to-speech into integrated conversational agents running entirely on local hardware. Does NOT include individual components (ASR, TTS, LLM separately), cloud-dependent assistants, or specialized applications like coding assistants or language tutoring.
101 tools
iOS Speech Frameworks
Native iOS/macOS/tvOS applications and SDKs wrapping Apple's Speech.framework and AVFoundation for text-to-speech, speech recognition, or voice synthesis. Does NOT include cloud-only services, Python/web implementations, or general voice AI applications without iOS-native code.
99 tools
Self-Hosted TTS Servers
Complete server implementations and APIs for running TTS models locally, including voice cloning and streaming capabilities. Does NOT include TTS libraries, wrappers for commercial services, or standalone applications without API/server components.
97 tools
Voice Agent Applications
End-to-end voice AI agents that handle specific real-world tasks (customer service, insurance claims, healthcare, political outreach, restaurant orders). Does NOT include standalone ASR/TTS components, SDK libraries, or single-capability voice tools.
88 tools
Discord TTS Bots
Discord bots that convert text messages to speech in voice channels. Does NOT include music bots, general Discord bots without TTS functionality, or TTS APIs/services themselves.
86 tools
Python Voice Assistants
Desktop and local Python-based voice assistants with task automation capabilities. Includes general-purpose voice command systems inspired by JARVIS/Alexa. Does NOT include platform-specific implementations (Discord, Slack bots), specialized voice applications (gaming, coding), or core speech/TTS components.
82 tools
Voice Controlled Robotics
Physical robots and robotic platforms controlled via voice commands or speech recognition. Includes DIY robot projects, robot kits, and voice-activated mechanical systems. Does NOT include home automation, virtual assistants, or purely software-based voice applications.
81 tools
Lightweight TTS Runtimes
Lightweight, self-contained Text-to-Speech implementations optimized for edge deployment, local inference, and minimal dependencies (typically C++/ONNX-based). Does NOT include cloud-based TTS APIs, general speech synthesis frameworks, or non-TTS applications.
79 tools
Speech Recognition APIs
Tools and implementations for converting spoken audio to text using cloud APIs (Google, etc.) and basic speech-to-text workflows. Does NOT include speech translation, conversational systems, cascading architectures, or domain-specific applications like voice typing editors or JOSM integration.
78 tools
AI Video Generation
Tools for automatically generating short-form or long-form videos from text, topics, or source material using AI for scripting, visuals, voiceovers, and editing. Does NOT include video editing software, speech recognition tools, or real-time video synthesis from physics simulation.
75 tools
Google TTS Libraries
Node.js/JavaScript libraries and wrappers for Google's Text-to-Speech API and Google Translate TTS. Does NOT include other TTS providers (AWS Polly, IBM Watson, ElevenLabs), home automation integrations, or applications built on top of TTS.
75 tools
FastSpeech TTS Models
PyTorch implementations and variants of FastSpeech and FastSpeech2 architectures for neural text-to-speech synthesis. Does NOT include other TTS architectures (Transformer-TTS, Glow-TTS), vocoder implementations, or non-FastSpeech based speech synthesis models.
74 tools
Kokoro TTS Ecosystem
Implementations, deployments, and applications of the Kokoro TTS model across different platforms and formats (ONNX, CoreML, C++, Android, web). Does NOT include other TTS models, voice cloning extensions, or non-Kokoro speech synthesis engines.
72 tools
Voice Cloning Tools
Applications and libraries for cloning voices from audio samples and synthesizing speech in cloned voices. Includes web interfaces, APIs, and local implementations using models like XTTS-v2, Coqui TTS, and YourTTS. Does NOT include general TTS without cloning capability, speech recognition, or avatar/video generation.
71 tools
OpenAI TTS Applications
Web and desktop applications built on OpenAI's text-to-speech API, including wrappers, UI clients, and end-user tools. Does NOT include other TTS providers (Azure, AWS Polly, Piper, ElevenLabs), speech recognition, or lower-level SDKs/libraries.
71 tools
Neural Vocoder Implementations
Tools and models for converting mel-spectrograms or acoustic features into high-fidelity waveforms using neural networks (GANs, diffusion, autoregressive models). Does NOT include end-to-end TTS systems, speech recognition, or general audio processing.
71 tools
Coqui TTS Applications
Production implementations and wrappers around the Coqui TTS engine (including XTTS variants) with APIs, servers, language-specific adaptations, and UI frontends. Does NOT include general TTS frameworks, other TTS engines, or speech recognition tools.
70 tools
Tacotron TTS Models
Implementations and variants of Tacotron and Tacotron2 neural architectures for end-to-end text-to-speech synthesis. Does NOT include other TTS architectures (FastSpeech, Glow-TTS, VITS), neural vocoders, or non-TTS applications.
70 tools
Voice Chatbot Applications
Conversational AI systems that process voice input, generate intelligent responses, and output speech. Includes chatbots with speech-to-text and text-to-speech pipelines for dialogue. Does NOT include standalone speech recognition, text-to-speech engines, or non-conversational voice tools.
67 tools
Kaldi ASR Ecosystem
Tools, recipes, models, and utilities built on or for the Kaldi ASR framework, including language-specific implementations, format converters, and training pipelines. Does NOT include non-Kaldi ASR systems, general speech recognition APIs, or TTS tools.
66 tools
CTC ASR Implementations
End-to-end speech recognition systems using the Connectionist Temporal Classification (CTC) loss function for acoustic modeling. Includes CTC decoders, training frameworks, and CTC variants (CTC-CRF). Does NOT include general ASR frameworks without explicit CTC focus, TTS systems, or non-neural ASR approaches.
65 tools
Java TTS Libraries
Java-based text-to-speech libraries and frameworks that synthesize speech from text. Includes wrappers around cloud APIs (Google, Microsoft) and offline TTS engines. Does NOT include speech recognition, voice cloning, Android apps, or server/application implementations.
65 tools
Voice Command Assistants
Full-featured AI assistants that combine speech recognition, natural language understanding, and voice synthesis to handle conversational tasks like news retrieval, information lookup, and general assistance. Does NOT include isolated ASR/TTS components, chatbots without voice I/O, or task-specific voice tools like calculators or coding assistants.
65 tools
Qwen3 TTS Applications
Web and desktop applications built on Alibaba's Qwen3-TTS model for text-to-speech synthesis, voice cloning, voice customization, and audiobook generation. Does NOT include ASR tools, non-Qwen TTS models, or lower-level TTS SDKs/APIs.
64 tools
Browser TTS Extensions
Chrome/browser extensions that convert web content (text, subtitles, selected passages) to speech for accessibility and reading assistance. Does NOT include standalone TTS libraries, voice cloning, ASR/transcription, or non-browser applications.
63 tools
Speech Corpora Datasets
Collections and catalogs of annotated speech audio data for training ASR, TTS, and voice AI models. Does NOT include tools for processing/cleaning datasets, annotation pipelines, or model implementations.
63 tools
eBook to Audiobook Conversion
Tools for converting written documents (ebooks, PDFs, EPUBs) into audio format using text-to-speech technology. Does NOT include general TTS engines, podcast creation from scripts, or audio-only content generation without a source document.
62 tools
Edge TTS Implementations
Implementations and integrations of Microsoft Edge's text-to-speech service across different platforms and applications. Does NOT include other TTS engines (AWS Polly, IBM Watson), speech recognition, or general voice AI SDKs.
62 tools
Text To Speech Frameworks
62 tools
Local Voice Dictation
Tools for real-time, on-device speech-to-text input that integrates with system text fields and applications via hotkey activation. Focuses on privacy-preserving local transcription for immediate typing/input workflows. Does NOT include cloud-based transcription, diarization, subtitle generation, or voice synthesis.
62 tools
Whisper Subtitle Generation
Tools that automatically generate subtitle files (SRT, VTT, etc.) from video/audio using speech recognition and optionally translate them. Does NOT include real-time captioning, audio alignment/synchronization libraries, or diarization systems.
60 tools
AI Tutoring Platforms
Interactive AI-powered educational applications that provide personalized instruction, conversational learning, and real-time feedback across subjects through voice, chat, or multimodal interfaces. Does NOT include general educational content generators, speech therapy tools, or standalone speech recognition/pronunciation analysis without tutoring context.
59 tools
Go TTS Libraries
Go/Golang libraries and SDKs for text-to-speech conversion, including integrations with cloud speech APIs and lightweight local TTS engines. Does NOT include applications built on top of TTS, non-Go implementations, or speech recognition (ASR) tools.
59 tools
Content-to-Podcast Converters
Tools that transform written content (emails, web articles, PDFs, spreadsheets) into audio podcast episodes using AI script generation and text-to-speech. Does NOT include general TTS tools, speech translation, or manual podcast production platforms.
58 tools
Voice AI Learning Collections
Educational Python repositories and coding practice collections covering diverse domains (utilities, automation, tutorials). Does NOT include specialized voice-AI tools, production applications, or focused libraries for specific tasks like TTS/ASR.
57 tools
Educational Voice Apps
Mobile and web applications that use voice AI (speech recognition, text-to-speech, voice commands) as core features for learning, teaching, or educational engagement across subjects like language learning, history, accessibility, and skill development. Does NOT include general-purpose voice assistants, voice translation tools without educational context, or accessibility apps focused primarily on independence tasks (banking, navigation, shopping).
56 tools
AI Avatar Platforms
Real-time interactive digital humans with synchronized lip-sync, voice cloning, and conversational AI. Includes avatar creation systems, facial animation, and end-to-end platforms combining video synthesis with voice interaction. Does NOT include standalone TTS, ASR, or video generation tools used independently.
53 tools
Speech AI Coursework
Educational materials, course assignments, tutorials, and seminar presentations focused on teaching speech processing, voice systems, and audio AI fundamentals. Does NOT include production tools, commercial applications, or research papers without educational scaffolding.
53 tools
Voice ChatGPT Interfaces
Conversational AI interfaces that combine ChatGPT/LLMs with speech-to-text and text-to-speech for spoken dialogue. Does NOT include single-purpose assistants (coding, cooking, robotics), SDK libraries, or implementations of specific TTS/ASR engines.
53 tools
Multimodal Medical Assistants
AI healthcare assistants combining voice (speech recognition/synthesis) with vision (image analysis) for medical consultation and diagnosis support. Does NOT include general medical chatbots without multimodal capabilities, voice-only medical apps, or non-healthcare multimodal systems.
53 tools
Android Voice Assistants
Complete voice assistant applications for Android devices with integrated speech recognition and command processing. Does NOT include SDKs/libraries, web-based assistants, or specialized single-function voice tools.
52 tools
TTS Model Fine-Tuning
Repositories for fine-tuning and training text-to-speech models on custom datasets, including LoRA and full model adaptation. Does NOT include pre-built TTS services, inference-only implementations, or general voice cloning without model training.
52 tools
Assistive Vision AI
Tools combining computer vision (object detection, OCR, scene understanding) with a voice interface to assist visually impaired users in real-time navigation and environmental awareness. Does NOT include general image-to-speech, security surveillance, or non-accessibility-focused vision systems.
50 tools
Telegram Voice Transcription
Telegram bots that convert voice messages, video notes, and audio files to text transcripts. Does NOT include general chatbots, translation-only tools, or non-Telegram voice applications.
49 tools
Meeting Transcription Summarizers
Tools that automatically transcribe meetings/lectures and generate summaries from audio or video recordings. Includes speaker diarization and timestamp organization. Does NOT include general transcription tools, podcast converters, or lecture slide generation.
49 tools
Voice Controlled Desktop Automation
Tools that use voice commands to control desktop applications, automate system tasks, and interact with OS functions. Does NOT include chatbots, general-purpose assistants without desktop control, or physical hardware projects.
47 tools
FunASR Speech Recognition
Speech recognition APIs and clients built on or wrapping FunASR and similar open-source ASR frameworks. Includes deployment servers, language bindings, and integration layers. Does NOT include text-to-speech, voice assistants, or end-user applications using ASR as a component.
46 tools
Wav2Vec2 ASR Models
Fine-tuning frameworks and implementations of Wav2Vec 2.0 for automatic speech recognition across languages. Does NOT include general ASR systems using other architectures (WaveNet, etc.), TTS, or non-ASR applications of Wav2Vec.
46 tools
Speech Emotion Recognition
Tools for detecting, classifying, and analyzing emotional states from audio speech input. Includes multimodal approaches combining speech with text/lyrics. Does NOT include general sentiment analysis of text-only content, hate speech detection, or emotion-modulated TTS output generation.
45 tools
Wake Word Detection
Tools for detecting specific trigger words or commands in audio streams, typically optimized for embedded/edge devices. Does NOT include general speech recognition, ASR, or speech classification beyond wake-word activation.
45 tools
Vue Speech Recognition
Vue.js components and libraries for integrating Web Speech API and speech-to-text functionality into web applications. Does NOT include text-to-speech, voice synthesis, or non-Vue speech recognition tools.
45 tools
Rust TTS Libraries
Rust bindings, crates, and wrappers for text-to-speech engines and TTS APIs. Does NOT include non-Rust TTS implementations, speech recognition, or higher-level applications built on TTS.
45 tools
Audio Transcription Apps
Web and mobile applications that convert speech to text in real time or from audio files, with features like note-taking, translation, or formatting. Does NOT include ASR model implementations, TTS synthesis, or voice assistants.
44 tools
eSpeak-NG Ecosystem
Wrappers, bindings, and extensions for eSpeak NG across multiple programming languages and platforms. Does NOT include other TTS engines, general text-to-speech tools, or non-eSpeak speech synthesis projects.
43 tools
Speech Translation Apps
Applications that translate speech from one language to another in real time or near real time, combining speech recognition, translation, and text-to-speech synthesis. Does NOT include standalone translation tools, transcription-only apps, or single-language speech synthesis.
43 tools
Zero-Shot Voice Synthesis
Tools for synthesizing speech with zero-shot or few-shot learning, enabling speaker cloning, emotion control, style transfer, and voice conversion without extensive training data. Does NOT include general text-to-speech engines, ASR systems, or non-zero-shot voice synthesis approaches.
43 tools
Deepgram Starter Projects
Beginner-friendly demo applications and starter templates for Deepgram's speech APIs (transcription, text-to-speech, voice agents) across multiple frameworks and languages. Does NOT include SDK libraries, production applications, or non-Deepgram speech tools.
43 tools
Vosk ASR Implementations
Offline speech recognition tools and integrations built on the Vosk toolkit. Does NOT include other ASR engines, text-to-speech, or general voice AI platforms.
42 tools
Gradio TTS WebUIs
Gradio-based web interfaces for text-to-speech and voice synthesis tools. Includes wrapped TTS engines with UI controls for voice selection, speed, and audio export. Does NOT include standalone TTS libraries, non-Gradio web frameworks, or voice cloning without TTS generation.
42 tools
Video Dubbing Tools
End-to-end solutions for automatically translating and dubbing video content with synchronized speech synthesis and voice cloning. Does NOT include general video generation, subtitle tools, or standalone TTS/ASR services.
41 tools
Voice Cloning Synthesis
41 tools
ElevenLabs Integrations
Wrappers, clients, and integrations for the ElevenLabs API and platform. Does NOT include general TTS tools, other TTS services, or applications that only use ElevenLabs as one of multiple backends.
40 tools
Video Transcription Extraction
Tools that transcribe video/audio content into text format with optional summarization, translation, or subtitle generation. Does NOT include voice cloning, speaker diarization, or real-time streaming analysis.
39 tools
Real-Time Voice Translation
Tools that capture spoken language in real time, translate it to another language, and output the result as speech or text. Focuses on live interpretation across languages. Does NOT include static text translation, document translation, or tools primarily designed for transcription without translation.
38 tools
Piper TTS Ecosystem
Tools, integrations, and extensions built around the Piper TTS system, including model training, platform ports, wrappers, and specialized implementations. Does NOT include general TTS systems, other TTS engines, or non-Piper speech synthesis tools.
38 tools
Speaker Diarization Embedding
37 tools
Whisper Transcription Apps
36 tools
AWS Polly TTS
Tools and applications for converting text to speech using AWS Polly and related cloud TTS services. Includes integrations, API wrappers, and implementations across various platforms. Does NOT include general TTS frameworks, speech recognition, or voice cloning tools.
36 tools
Twitch Chat TTS
Tools that convert Twitch chat messages to spoken audio for streamers and their audiences. Includes chat filtering, channel points integration, and stream platform adapters. Does NOT include general TTS services, non-streaming chat applications, or LLM-based response generation (unless TTS output is the primary focus).
35 tools
Sign Language Translation
Tools that convert between sign language and spoken/written language using gesture recognition, computer vision, and speech synthesis. Does NOT include general gesture control, voice commands, or accessibility features that don't involve sign language translation.
34 tools
AI-Powered eReaders
Tools for reading digital books (EPUB, PDF, manga, etc.) with integrated AI features like text-to-speech, synchronized highlighting, OCR-based translation, and interactive learning. Does NOT include general TTS engines, standalone audiobook players, or non-reading-focused applications.
33 tools
React Native Voice Libraries
React Native libraries and modules for speech recognition, text-to-speech, and voice processing on iOS and Android. Does NOT include complete voice applications, web frameworks, or non-React Native speech tools.
33 tools
TTS Dataset Creation
Tools and workflows for preparing, recording, processing, and organizing audio datasets specifically for training text-to-speech models. Does NOT include pre-built TTS datasets, TTS model training frameworks, or general speech datasets for ASR/voice cloning.
33 tools
VITS TTS Implementations
VITS-based text-to-speech models, servers, and fine-tuning projects across multiple languages. Includes VITS variants (Bert-VITS2, MB-iSTFT-VITS), API implementations, and language-specific deployments. Does NOT include non-VITS TTS engines, speech recognition, or voice cloning systems without TTS synthesis.
32 tools
PDF to Audio Conversion
Tools that convert PDF documents into audio files through text extraction and text-to-speech synthesis. Does NOT include general TTS engines, video conversion, or tools that read non-PDF text files.
32 tools
Audio Transcription Tools
31 tools
React Speech Recognition
React applications using browser-based speech-to-text APIs for voice input and transcription. Does NOT include text-to-speech, voice synthesis, backend speech processing, or non-React voice implementations.
31 tools
Voice Dictation Typing
Tools that convert speech to text in real time and input the transcribed text directly into applications for typing/dictation purposes. Does NOT include translation, diarization, subtitle generation, or voice synthesis applications.
30 tools
Parakeet ASR Implementations
Open-source speech-to-text and transcription tools built on or compatible with NVIDIA Parakeet models, including local deployments, API servers, and optimized inference frameworks. Does NOT include general TTS synthesis, non-Parakeet ASR systems, or voice cloning applications.
30 tools
System TTS Wrappers
Lightweight wrappers and CLI interfaces for operating system built-in text-to-speech engines (macOS `say`, Windows SAPI, etc.). Does NOT include cloud-based TTS APIs, specialized speech synthesis libraries, or applications built on top of TTS.
30 tools
SMS Voice Integrations
Plugins and integrations for sending SMS and text-to-speech calls through third-party APIs (like seven.io, TotalVoice, InfoBip) into existing platforms and workflows. Does NOT include standalone TTS engines, speech recognition, or general voice AI assistants.
29 tools
Cross-Platform TTS Frameworks
Libraries and frameworks that provide unified APIs for accessing multiple TTS engines across different operating systems and platforms. Does NOT include standalone TTS applications, speech recognition, or language-specific TTS implementations.
29 tools
Conformer ASR Implementations
Implementations and variants of the Conformer architecture for automatic speech recognition. Does NOT include general ASR frameworks, other acoustic models, or fine-tuned models for specific languages/domains without Conformer as the core architecture.
28 tools
Voice Enabled Coding Assistants
Tools that add voice input/output capabilities to AI coding assistants (Claude Code, etc.) via TTS, STT, or voice cloning. Does NOT include standalone audio tools, general voice apps, or tools without coding assistant integration.
28 tools
Text To Speech Conversion
27 tools
Whisper Framework Ports
Framework-specific implementations and bindings of Whisper.cpp for game engines, mobile platforms, and system integrations. Includes language bindings and platform-specific compilations. Does NOT include higher-level applications, UI wrappers, or server implementations that use Whisper.
27 tools
Whisper Fine-Tuning
Tools and frameworks for fine-tuning Whisper models on custom datasets, including language-specific adaptation, accent conditioning, and model distillation. Does NOT include pre-built Whisper applications, deployment wrappers, or inference optimization without training components.
27 tools
Live Caption Generation
Real-time speech-to-text transcription and subtitle display for audio/video streams, broadcasts, and live events. Does NOT include speech translation (unless paired with transcription), general ASR models, or non-real-time captioning systems.
27 tools
ASR Evaluation Metrics
Tools for measuring and analyzing the accuracy of automatic speech recognition systems through metrics like WER, CER, and DER. Does NOT include ASR models themselves, transcription services, or general audio quality assessment.
26 tools
ComfyUI TTS Nodes
Custom ComfyUI nodes and integrations for text-to-speech synthesis across multiple TTS models and engines. Does NOT include standalone TTS applications, speech recognition, voice conversion, or non-ComfyUI TTS tools.
26 tools
PHP TTS Libraries
PHP libraries and packages for text-to-speech synthesis across multiple TTS providers (Google Cloud, AWS Alexa, etc.). Does NOT include framework-specific implementations (Laravel, Yii2), non-PHP TTS tools, or speech recognition systems.
26 tools
Audio Noise Reduction
25 tools
Live Meeting Translation
Real-time speech translation and captioning for meetings, lectures, and video calls using browser APIs or dedicated platforms. Does NOT include general speech recognition, text translation tools, or non-real-time transcription services.
25 tools
Whisper Diarization
Tools that combine OpenAI Whisper (or similar ASR) with speaker diarization to identify and separate speakers in audio. Does NOT include general transcription without speaker identification, or standalone diarization tools without ASR components.
24 tools
Grapheme-to-Phoneme Conversion
Tools for converting written text (graphemes) into phonetic representations (phonemes) across languages. Includes G2P models, phonemizers, and language-specific phonetic converters. Does NOT include general text-to-speech synthesis, speech recognition, or IPA transcription editors.
24 tools
Rust Speech Recognition
Rust-based speech-to-text and audio processing libraries with local inference capabilities. Includes STT engines, noise reduction, and voice activity detection. Does NOT include cloud-dependent APIs, TTS, LLMs, or non-Rust implementations.
24 tools
Streamlit TTS Apps
Streamlit-based applications for text-to-speech conversion and speech synthesis. Includes multi-language TTS, audio playback interfaces, and speech editing tools. Does NOT include speech recognition (STT), voice cloning, specialized TTS libraries, or non-Streamlit implementations.
24 tools
Embedded TTS Systems
Lightweight text-to-speech implementations for microcontrollers and embedded devices (Arduino, ESP32, Teensy). Does NOT include cloud-based TTS services, general TTS libraries for standard computers, or speech recognition systems.
22 tools
Interactive AI Avatars
Tools for creating animated 2D/3D character avatars (Live2D, VRM) that interact via voice and text with real-time lip-sync, facial expressions, and emotional responses. Does NOT include static avatar generators, general chatbots without animation, or VTuber streaming infrastructure separate from the avatar interaction system.
22 tools
Anki TTS Integration
Tools and utilities that integrate text-to-speech capabilities into Anki flashcard decks for audio generation, playback, and language learning. Does NOT include standalone TTS engines, general speech recognition, or non-Anki language learning applications.
22 tools
OpenClaw Voice Assistants
Complete voice interface applications and integrations built on the OpenClaw platform, combining speech recognition, text-to-speech, and conversational AI. Does NOT include standalone STT/TTS tools, general voice SDKs, or non-OpenClaw voice assistants.
21 tools
Voice AI SDKs
Python SDKs and client libraries for commercial voice AI platforms (ASR, TTS, translation, voice agents). Does NOT include open-source speech recognition implementations, language models, or framework-specific integrations.
21 tools
Yandex SpeechKit Tools
SDKs, wrappers, and integrations for Yandex SpeechKit API across multiple languages and platforms. Does NOT include general speech recognition/TTS tools, other cloud providers' speech APIs, or smart home devices unrelated to SpeechKit.
21 tools
Audio Source Separation
20 tools
News Audio Bulletins
Tools that automatically gather, summarize, and convert news content into audio format for consumption via broadcasts, bots, or apps. Includes news scraping, summarization, and TTS integration. Does NOT include general content-to-podcast converters, non-news audio synthesis, or standalone TTS/ASR tools.
19 tools
Web-Based TTS Apps
Text-to-speech applications built on Flask and other web frameworks, with user interfaces for converting text to speech. Includes dictionary/image text extraction features integrated into web apps. Does NOT include standalone TTS libraries, speech-to-text, or voice cloning tools.
19 tools
AI Interview Simulators
Platforms for practicing job interviews with AI interviewers using voice/video, real-time feedback, and automated scoring. Does NOT include general career coaching, resume builders, or hiring/recruitment platforms for employers.
19 tools
Image-to-Speech Synthesis
Tools that convert visual content (images, documents, video frames) into spoken audio through image captioning, optical character recognition, or visual description generation combined with text-to-speech. Does NOT include standalone OCR, image captioning without audio output, or general TTS systems without visual input processing.
19 tools
Text Normalization Engines
Tools for normalizing written text into spoken forms across languages, handling numbers, dates, abbreviations, and special characters for TTS and speech processing. Does NOT include general text-to-speech synthesis, speech recognition, or audio processing.
17 tools
Home Assistant TTS
Integrations and plugins that add text-to-speech capabilities to the Home Assistant automation platform. Does NOT include standalone TTS engines, general voice AI tools, or non-Home Assistant smart home integrations.
17 tools
Face Recognition Systems
Authentication and access control systems that use facial recognition as the primary security mechanism, often combined with multimodal verification (voice, liveness detection, QR codes). Does NOT include general computer vision, face detection without authentication, or standalone speech/voice systems.
17 tools
IBM Watson Speech
Tools and integrations for IBM Watson's speech-to-text and text-to-speech APIs. Includes implementations across programming languages, real-time transcription, and Watson service applications. Does NOT include other cloud speech providers (AWS, Google Cloud) or general-purpose voice AI frameworks.
15 tools
Government Procurement Docs
Solicitation documents, acquisition guides, and procurement-related materials for government technology projects and services. Does NOT include actual software tools, implementations, or operational systems.
15 tools
Clipboard Text-to-Speech
Tools that monitor the clipboard and automatically read aloud any copied text using TTS engines. Does NOT include general text-to-speech, selected-text readers without clipboard integration, or voice synthesis SDKs.
15 tools
Ukrainian Voice AI
Open-source speech recognition, text-to-speech, and phonetic processing tools specifically for the Ukrainian language. Does NOT include multilingual solutions, language-agnostic frameworks, or non-Ukrainian language implementations.
13 tools
Text To Speech TTS
12 tools
Whisper Speech Transcription
12 tools
Voice Assistant Devices
12 tools
Persian Speech AI
Tools, datasets, and models specifically for Persian/Farsi speech recognition, text-to-speech, and related NLP tasks. Does NOT include general multilingual speech tools, non-Persian language resources, or speech AI for other languages.
12 tools
Audio Music Learning
11 tools
Multilingual Speech Datasets
Curated speech corpora and audio datasets across multiple languages for training ASR and speech processing models. Does NOT include text-to-speech synthesis, voice cloning, or speech recognition inference tools.
11 tools
Speech To Text Transcription
10 tools
Voice Assistant Applications
9 tools
Voice AI Assistants
8 tools
Voice Controlled Calculators
Tools that perform mathematical calculations through voice input/commands and typically provide spoken output. Includes scientific calculators, unit converters, and math solvers with voice interfaces. Does NOT include general math tutoring apps, voice coding assistants, or text-to-speech engines without calculation functionality.
8 tools
STT
8 tools
AI Podcast Generation
7 tools
Conversational Chatbot Applications
7 tools
Lip Reading Synthesis
7 tools
Sign Language Recognition
7 tools
Voice Interactive Games
Web-based games and educational applications using speech recognition for user input and gameplay mechanics (number guessing, word games, pronunciation training). Does NOT include voice assistants, transcription tools, or non-interactive speech recognition systems.
7 tools
Text To Speech
6 tools
Voice Assistant Frameworks
6 tools
Multimodal Vision Language
6 tools
Voice Assistant Projects
5 tools
TTS
5 tools
Wav2Vec2 Speech Recognition
4 tools
Data Annotation Tools
4 tools
Speech Synthesis Diffusion
4 tools
Text To Video Generation
4 tools
Virtual Assistants NLP
4 tools
Bioacoustic Species Classification
4 tools
Audio Event Classification
4 tools
Voice AI Agents
3 tools
Speech Recognition Datasets
3 tools
Unity ML Inference
3 tools
Deepfake Detection Systems
3 tools
Personal Assistant RAG
3 tools
Conversational RAG Agents
3 tools
Image Caption Generation
3 tools
Facial Attribute Classification
3 tools
Joke Telling Apps
Web and desktop applications that fetch jokes from APIs and use text-to-speech to narrate them aloud. Does NOT include general text-to-speech tools, voice cloning, speech recognition, or joke APIs themselves.
3 tools
Text To Speech MCP
2 tools
LLM Scaling Architecture
2 tools
ComfyUI Extensions
2 tools
Flutter AI Chat Apps
2 tools
Multi Modal AI Assistants
2 tools
AI Virtual Companions
2 tools
Machine Translation Systems
2 tools
AI Chatbot Interfaces
2 tools
Next Word Prediction
2 tools
Audio Classification Transformers
2 tools
AI Image Generation Platforms
2 tools
Natural Language Task Scheduling
2 tools
Text Translation Tools
2 tools
AI Workflow Automation
1 tool
AI Assistant Platforms
1 tool
Text Embedding Runtimes
1 tool
MediaPipe Implementations
1 tool
Discord AI Chatbots
1 tool
Vision Language Models
1 tool
Indic Language Translation
1 tool
Neural Machine Translation
1 tool
GPT Implementation Tutorials
1 tool
Gemini Prompt Workbenches
1 tool
Speculative Decoding Algorithms
1 tool
Text Scanning OCR
1 tool
Text Emotion Recognition
1 tool
Multi Agent Orchestration
1 tool
LLM Inference Serving
1 tool
Vibe Coding Frameworks
1 tool
Vietnamese NLP Tools
1 tool
Respiratory Disease Detection
1 tool
AI Terminal Agents
1 tool
AI Note Taking Apps
1 tool
Document QA Chatbots
1 tool
AI Children Storytelling
1 tool
NLP Task Libraries
1 tool
LLM Fine Tuning
1 tool
Chatbot Frameworks
1 tool
Talking Head Generation
1 tool
Gemini API Applications
1 tool
LLM Docker Deployments
1 tool
Stress Detection ML
1 tool
NLP Dataset Collections
1 tool
Fullstack AI Assistants
1 tool
Graph Database RAG
1 tool
Video Content Intelligence
1 tool
Temporal Expression Parsing
1 tool
Health App Development
1 tool
CLIP Vision Language
1 tool
AI Interview Coaching
1 tool
Hand Gesture Control
1 tool
ML Benchmarking Frameworks
1 tool
Viral Clip Generation
1 tool
Model Compression Optimization
1 tool
Edge Camera ML
1 tool
OCR Document Extraction
1 tool
Go ML Bindings
1 tool
Reading Comprehension QA
1 tool
Tokenization Libraries
1 tool
LLM Translation Tools
1 tool
AI Skill Integrations
1 tool
Facial Recognition Apps
1 tool
Federated Learning Frameworks
1 tool
Personal Knowledge Management
1 tool
Flashcard Generation
1 tool
Streamlit Chatbot Apps
1 tool
ML Learning Resources
1 tool
LLM SDK Packages
1 tool
Semantic Kernel Tools
1 tool
Embedding Model Tuning
1 tool
LLM Learning Resources
1 tool
Chatbot NLP Frameworks
1 tool
Telegram LLM Bots
1 tool
NLU Game Applications
1 tool
Diffusion Model Frameworks
1 tool
Image Classification Demos
1 tool