kahne/SpeechTransProgress
Tracking the progress in end-to-end speech translation
Comprehensive resource compiling multilingual speech translation datasets (CoVoST 2, CVSS, mTEDx, MuST-C) spanning 20+ language pairs with both text and speech targets, alongside implementations in major frameworks such as ESPnet-ST and Fairseq S2T. Tracks benchmark progress through peer-reviewed papers and tutorials covering direct speech-to-text translation without intermediate ASR and end-to-end architectures that jointly model acoustic and linguistic knowledge. Provides a curated bibliography of recent advances, including large language model fine-tuning approaches and comparative evaluations across domain corpora ranging from TED talks to parliamentary proceedings.
261 stars. No commits in the last 6 months.
Stars: 261
Forks: 25
Language: —
License: CC0-1.0
Category:
Last pushed: Oct 25, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/kahne/SpeechTransProgress"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
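The same request can be made from a script. The following is a minimal sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred only from the single example URL above, and the response is assumed to be JSON (its fields are not documented here). Key-based access is left out because the page does not say how a key is passed.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path is <category>/<owner>/<repo>, as in the
    single example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository and parse it.

    Assumption: the endpoint returns a JSON body. No API key is sent,
    so the keyless daily limit applies.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "kahne", "SpeechTransProgress")` issues the same GET request as the curl command. Since keyless access is capped at 100 requests/day, caching the result locally is a reasonable choice when polling several repositories.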
Higher-rated alternatives
voicegain/platform
Voicegain Enterprise Speech-to-Text Platform (API, Portal, etc.)
davidamacey/OpenTranscribe
Self-hosted AI-powered transcription platform with speaker diarization, search, and...
aws-samples/amazon-transcribe-live-call-analytics
Amazon Transcribe Live Call Analytics (LCA) Sample Solution
SamirPaulb/real-time-voice-translator
A desktop application that uses AI to translate voice between languages in real time, while...
jim-schwoebel/voicebook
🗣️ A book and repo to get you started programming voice computing applications in Python (10...