upskyy/Squeezeformer

PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022)


Emerging

Combines a Temporal U-Net structure, which downsamples the sequence to reduce the cost of multi-head attention on long inputs, with a simpler block design in which each attention or convolution module is followed by a feed-forward module, replacing the Macaron structure used in Conformer. Designed for CTC-based ASR training, and integrates with the OpenSpeech framework for full pipeline support. The model handles variable-length sequences through input and output length masking, so it fits into standard PyTorch training loops.
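As a rough illustration of how length-aware outputs plug into a standard PyTorch CTC training step, here is a minimal sketch. `TinyEncoder` is a hypothetical stand-in written for this example, not the Squeezeformer model or its API; the feature dimension, class count, and the single stride-2 convolution are all assumptions chosen to keep the example small. Only `torch.nn.CTCLoss` and the other core PyTorch calls are real library APIs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a CTC encoder; NOT the Squeezeformer API."""

    def __init__(self, input_dim=80, hidden_dim=32, num_classes=10):
        super().__init__()
        # One stride-2 convolution over time: a crude placeholder for subsampling.
        self.conv = nn.Conv1d(input_dim, hidden_dim, kernel_size=3, stride=2, padding=1)
        self.proj = nn.Linear(hidden_dim, num_classes)

    def forward(self, inputs, input_lengths):
        # inputs: (batch, time, features); Conv1d expects (batch, features, time).
        x = self.conv(inputs.transpose(1, 2)).transpose(1, 2)
        # Valid length after kernel=3, stride=2, padding=1: floor((L - 1) / 2) + 1.
        output_lengths = torch.div(input_lengths - 1, 2, rounding_mode="floor") + 1
        log_probs = self.proj(torch.relu(x)).log_softmax(dim=-1)
        return log_probs, output_lengths


model = TinyEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CTCLoss(blank=0, zero_infinity=True)

# Two utterances padded to 50 frames; the second has only 37 valid frames.
inputs = torch.randn(2, 50, 80)
input_lengths = torch.tensor([50, 37])
# Padded label sequences (index 0 is reserved for the CTC blank).
targets = torch.tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 0]])
target_lengths = torch.tensor([5, 4])

log_probs, output_lengths = model(inputs, input_lengths)

# CTCLoss expects (time, batch, classes) and ignores frames past output_lengths.
loss = criterion(log_probs.transpose(0, 1), targets, output_lengths, target_lengths)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The point of the sketch is the data flow: the encoder returns per-utterance output lengths alongside its log-probabilities, and those lengths tell the CTC loss which frames of the padded batch are valid.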

148 stars. No commits in the last 6 months. Available on PyPI.


Maintenance 0 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 13 / 25


Stars

148

Forks

Language

Python

License

Apache-2.0

Higher-rated alternatives

khanld/chunkformer

ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription

sooftware/conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech...

WindQAQ/listen-attend-and-spell

Tensorflow implementation of "Listen, Attend and Spell" authored by William Chan. This project...

jackaduma/LAS_Mandarin_PyTorch

Listen, attend and spell Model and a Chinese Mandarin Pretrained model (中文-普通话 ASR模型)

kaituoxu/Listen-Attend-Spell

A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.
