yufan-aslp/AliMeeting
The project is associated with the recently launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) and provides participants with baseline systems for speech recognition and speaker diarization in conference scenarios.
Provides modular baseline recipes for both ASR and speaker diarization tracks, with integrated voice activity detection (VAD) pipelines that generate RTTM outputs for diarization error rate (DER) evaluation. Supports training of both single-speaker and multi-speaker ASR models on multi-channel meeting audio, with character error rate (CER) as the evaluation metric. Built around the AliMeeting dataset and designed for reproducibility on the CodaLab evaluation platform.
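As a rough illustration of the character error rate mentioned above, the sketch below computes CER as the character-level edit distance between a reference and a hypothesis, divided by the number of reference characters. It is not the project's scoring script: the function names `edit_distance` and `cer` are hypothetical, whitespace is simply dropped, and the challenge's official scoring may apply additional text normalization and multi-speaker handling that is not modeled here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, 1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate; whitespace removal is an assumption of this sketch."""
    ref_chars = [c for c in reference if not c.isspace()]
    hyp_chars = [c for c in hypothesis if not c.isspace()]
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

For example, `cer("abcd", "abxd")` counts one substitution against four reference characters. Because insertions count as errors, CER can exceed 1.0 when the hypothesis is much longer than the reference.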
135 stars. No commits in the last 6 months.
Stars: 135
Forks: 18
Language: Python
License: —
Category:
Last pushed: Jun 10, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/yufan-aslp/AliMeeting"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
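The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the path layout is inferred from the single URL shown above, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the segment order is inferred from the example."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{API_BASE}/{'/'.join(parts)}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (performs a network request):
# data = fetch_quality("voice-ai", "yufan-aslp", "AliMeeting")
```

No API key is sent here, so the call falls under the keyless daily limit mentioned above.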
Higher-rated alternatives
byjlw/video-analyzer
Analyze videos using LLMs, Computer Vision and Automatic Speech Recognition
XnneHangLab/XnneHangLab
A subtitle extractor that can't chat isn't a good Bilibili downloader~
harry0703/AudioNotes
Quickly extract audio and video content and organize it into structured Markdown notes
bakaburg1/minutemaker
Generate meeting minutes from an audio recording or a transcript using speech-to-text and LLMs.
kromme/Teams-Notetaker
Let AI create the notes of your Teams Meeting