khanld/chunkformer
ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription
Implements chunk-wise processing with relative right context and a Masked Batch technique that eliminates padding overhead, enabling transcription of audio up to 16 hours long on memory-constrained GPUs. Provides both RNN-T and CTC decoder variants via Hugging Face, supporting streaming and non-streaming ASR as well as speech classification tasks. Offers a Python API and a CLI for single-file and batch transcription, with configurable chunk sizes and context windows.
Available on PyPI.
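To make the chunk-wise idea concrete, the following is a minimal sketch of how a long frame sequence might be split into fixed-size chunks that each see some surrounding context. It is not the ChunkFormer implementation and does not use the package's API. The function name `chunk_with_context` and its parameters (`chunk_size`, `left_context`, `right_context`) are hypothetical, and the left-context window is an assumption added for illustration.

```python
from typing import List, Tuple


def chunk_with_context(
    num_frames: int,
    chunk_size: int,
    left_context: int,
    right_context: int,
) -> List[Tuple[int, int, int, int]]:
    """Split frames [0, num_frames) into chunks with surrounding context.

    Hypothetical helper, not part of the chunkformer package.
    Each tuple is (ctx_start, chunk_start, chunk_end, ctx_end):
    the model would read frames [ctx_start, ctx_end) but only emit
    output for the core span [chunk_start, chunk_end).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if left_context < 0 or right_context < 0:
        raise ValueError("context sizes must be non-negative")

    spans = []
    for chunk_start in range(0, num_frames, chunk_size):
        chunk_end = min(chunk_start + chunk_size, num_frames)
        # Clamp the context windows to the bounds of the recording.
        ctx_start = max(0, chunk_start - left_context)
        ctx_end = min(num_frames, chunk_end + right_context)
        spans.append((ctx_start, chunk_start, chunk_end, ctx_end))
    return spans


spans = chunk_with_context(num_frames=10, chunk_size=4,
                           left_context=2, right_context=1)
for span in spans:
    print(span)
```

Because every chunk has a bounded length regardless of the recording's duration, peak memory in a scheme like this depends on the chunk and context sizes rather than on the total audio length.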
Stars: 78
Forks: 21
Language: Python
License: —
Category:
Last pushed: Feb 13, 2026
Monthly downloads: 392
Commits (30d): 0
Dependencies: 18
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/khanld/chunkformer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
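The same request can be sketched in Python with only the standard library. This assumes the path follows a category/owner/repository pattern, which is inferred from the single URL shown above and is not confirmed by the page. The helper names `quality_url` and `fetch_quality` are hypothetical, and the response is returned as raw text because its format is not documented here.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumes a category/owner/repo path layout, inferred from one example.
    """
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str,
                  timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the body as text, without parsing it."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("voice-ai", "khanld", "chunkformer")` would issue the same GET request as the curl command and counts against the daily request limit.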
Related tools
sooftware/conformer
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech...
upskyy/Squeezeformer
PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech...
WindQAQ/listen-attend-and-spell
TensorFlow implementation of "Listen, Attend and Spell" authored by William Chan. This project...
jackaduma/LAS_Mandarin_PyTorch
Listen, Attend and Spell model and a pretrained Chinese Mandarin ASR model
TeaPoly/Conformer-Athena
Dynamic Chunk Streaming and Offline Conformer based on athena-team/Athena.