upskyy/Squeezeformer
PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022)
Combines Temporal U-Net downsampling, which reduces the cost of multi-head attention on long sequences, with a streamlined block design (attention or convolution followed by a single feed-forward module) that replaces the Macaron structure used in Conformer and improves efficiency. Designed for CTC-based ASR training, and integrates with the OpenSpeech framework for full pipeline support. Handles variable-length sequences through input and output length masking, and is compatible with standard PyTorch training loops.
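To make the efficiency argument concrete, here is a minimal pure-Python sketch of the length arithmetic behind temporal downsampling and length masking. It does not use this repository's API. The helper names, the stride of 2, and the ceiling-division rounding are all assumptions chosen for illustration.

```python
from typing import List


def downsample_length(length: int, stride: int = 2) -> int:
    """Frames left after one strided downsampling step.

    Ceiling division is an assumption; the real rounding depends on the
    kernel size and padding of the downsampling layer.
    """
    return -(-length // stride)


def attention_pairs(length: int) -> int:
    """Query-key pairs scored by full self-attention over `length` frames."""
    return length * length


def padding_mask(lengths: List[int]) -> List[List[bool]]:
    """One row per utterance: True for a real frame, False for padding."""
    max_len = max(lengths)
    return [[t < n for t in range(max_len)] for n in lengths]


# Hypothetical batch of three utterances, measured in encoder frames.
input_lengths = [500, 437, 312]

# Lengths seen by the attention layers after a stride-2 downsample.
reduced_lengths = [downsample_length(n) for n in input_lengths]

# Mask over the downsampled batch, so padded frames can be ignored.
mask = padding_mask(reduced_lengths)

# Halving the sequence length cuts the attention score matrix by about 4x.
savings = attention_pairs(500) / attention_pairs(downsample_length(500))
```

The quadratic term is why shortening the sequence in the middle of the network pays off: attention cost grows with the square of the frame count, so a 2x reduction in frames removes roughly three quarters of the score computations for those layers.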
148 stars. No commits in the last 6 months. Available on PyPI.
Stars: 148
Forks: 16
Language: Python
License: Apache-2.0
Category:
Last pushed: Nov 22, 2022
Commits (30d): 0
Dependencies: 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/upskyy/Squeezeformer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
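The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the path follows a category/owner/repo pattern for other repositories, and that the endpoint returns JSON. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the category/owner/repo layout is an assumption."""
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(repo)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "upskyy", "Squeezeformer")` issues the same GET as the curl command above. Unauthenticated calls count against the 100 requests/day limit, so it is worth caching the result.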
Higher-rated alternatives
khanld/chunkformer
ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription
sooftware/conformer
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech...
WindQAQ/listen-attend-and-spell
Tensorflow implementation of "Listen, Attend and Spell" authored by William Chan. This project...
jackaduma/LAS_Mandarin_PyTorch
Listen, attend and spell Model and a Chinese Mandarin Pretrained model (中文-普通话 ASR模型)
kaituoxu/Listen-Attend-Spell
A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.