Speech-To-Text Transcription NLP Tools

Tools for converting speech in audio and video into text transcripts using ASR models such as Whisper. The category includes applications for transcription, subtitle generation, and multi-language support. It does not cover tools whose primary focus is text-to-speech synthesis, voice cloning, or post-transcription NLP analysis.

There are 34 speech-to-text transcription tools tracked. The highest-rated is 512z/podlens at 46/100 with 5 stars and 117 monthly downloads.

Get the projects as JSON (the example request sets limit=20, so it asks for 20 of the 34 projects)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=speech-to-text-transcription&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
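
The same request can be made from a script. The following is a minimal Python sketch, assuming only what the curl example shows: the endpoint URL and the `domain`, `subcategory`, and `limit` query parameters. The function names `build_url` and `fetch_projects` are hypothetical, the response schema is not documented on this page, and whether the API accepts a `limit` above 20 is not confirmed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="speech-to-text-transcription", limit=20):
    """Return the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The structure of the returned JSON is an unknown here, so the decoded
    object is returned as-is without assuming any field names.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs the network request and counts against the daily quota described above; inspect the returned object before relying on any particular field.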

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | 512z/podlens | Free Podwise: AI Podcast & Youtube Transcription & Understanding Agent \|... | 46 | Emerging |
| 2 | AEmotionStudio/ComfyUI-FFMPEGA | Intelligent FFMPEG agent node for ComfyUI - transforms natural language... | 38 | Emerging |
| 3 | Sammybams/whisper-to-text-with-azure | A Telegram bot that performs transcription, translation and summarization on... | 35 | Emerging |
| 4 | HKAB/whisper-finetune-vietnamese | Whisper finetuned on VinBigdata-VLSP2020-100h + KenLM | 33 | Emerging |
| 5 | MLH-Fellowship/transcribio | A web application that allows educators to easily generate transcripts for... | 27 | Experimental |
| 6 | salihfurkaan/AutoSub-CLI | A tool that simplifies the process of adding subtitles to videos by... | 26 | Experimental |
| 7 | RizhongLin/PolyglotWhisperer | Transcribe, translate, and learn: Whisper + LLM video pipeline with dual... | 26 | Experimental |
| 8 | ins8ai/wer | Word Error Rate computation using components from huggingface-evaluate and... | 23 | Experimental |
| 9 | JericoMeca/parakeet-tdt-0.6b-v2-Batch-Transcriber | 🎙️ Transcribe audio efficiently with Parakeet Batch Transcriber, producing... | 23 | Experimental |
| 10 | stevenlawton/GPT-Whisper-captions | Automate subtitle generation for videos using OpenAI's Whisper API and... | 22 | Experimental |
| 11 | michael-borck/deep-talk | Transcribes and analyzes audio/video conversations locally with AI-powered insights. | 20 | Experimental |
| 12 | Jayem-11/Swahili_speech_to_text | Speech to Text for Swahili Language with Whisper-small. | 20 | Experimental |
| 13 | somosnlp/wav2vec2-spanish | Pre-train a Spanish Wav2Vec2 model using the Spanish portion of the Common... | 18 | Experimental |
| 14 | LexMainye/Kasuku-Transcriber | A speech to text web app for people with speech impairments that has support... | 16 | Experimental |
| 15 | eray-yuztyurk/python-ai-audio-transcriber-summarizer | AI-powered tool for fast, accurate audio transcription and summarization.... | 16 | Experimental |
| 16 | vincenthuang75025/chinglish | Chrome extension for translating highlighted English text into Chinglish (a... | 16 | Experimental |
| 17 | uqqu/sync_book | Audiobook generator with smart personalized translation | 15 | Experimental |
| 18 | Samarth-S-Shetty/caption_generator | Full-stack AI web app that automatically generates and burns captions into... | 15 | Experimental |
| 19 | stellarloop/video2text | Python API & command-line tool to easily transcribe speech-based video files... | 15 | Experimental |
| 20 | HMByteSensei/WhisperAI-Evaluation | Comprehensive benchmark of OpenAI Whisper models for Bosnian, Croatian, and... | 15 | Experimental |
| 21 | R3DK3LL/VocalFLow | Your voice - VocalFlow dictation, harnessing Whisper and faster-whisper for... | 14 | Experimental |
| 22 | danielsobrado/audio-processor | Audio processor, focused on English and Arabic with diarization and summarization | 13 | Experimental |
| 23 | antarades/emotion-aware-automatic-speech-recognition | An intelligent speech recognition system that combines OpenAI's Whisper for... | 13 | Experimental |
| 24 | MaharshPatelX/Speechitive | A video analytics tool converting videos to M3U8 playlists using HLS and... | 12 | Experimental |
| 25 | Jyotibrat/Speech-To-Text | Speech to Text model | 12 | Experimental |
| 26 | bivex/voice_to_text | A Python application for real-time Russian voice-to-text transcription and... | 11 | Experimental |
| 27 | singleshade8/japanese-subtitle-generator | GPU-accelerated Japanese → English subtitle generator using faster-whisper... | 11 | Experimental |
| 28 | felix-murcia/transcriberapp | TranscriberApp is a modular tool designed to process audio and... | 11 | Experimental |
| 29 | damiangohrh123/videotto-video-clips | Semantic video clip ranking system built with FastAPI, React, and GPT-4o. | 11 | Experimental |
| 30 | xubaiyi88-ai/lecture-ai-toolkit | NLP-based CLI tool that converts lecture transcripts into keywords,... | 11 | Experimental |
| 31 | ChaoticByte/audio-summarize | An audio summarizer (faster-whisper and BART glued together) | 11 | Experimental |
| 32 | Think-A-Move/SPEAR-SDK-Python-Linux | SPEAR-ASR and SPEAR-WakeUp Software Development Kit in Python for Linux | 10 | Experimental |
| 33 | udit-rawat/whisper-space | An ASR Gradio GUI based project that transcribes the audio and provides NLP... | 10 | Experimental |
| 34 | solveditnpc/zonos-audiobook | Zonos-v0.1 text-to-speech (TTS) model trained on more than 200k hours of... | 10 | Experimental |