kyegomez/VisionMamba

Implementation of Vision Mamba from the paper "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". It is 2.8x faster than DeiT and saves 86.8% GPU memory when performing batch inference to extract features on high-resolution images.

Score: 50 / 100 (Established)

Bidirectional state space models replace traditional attention mechanisms to achieve linear complexity, enabling efficient processing of high-resolution images with a significantly reduced memory footprint. The architecture uses patch embedding with configurable depth and state dimensions, and is built on PyTorch for straightforward model instantiation and forward-pass inference. The model supports flexible configuration across vision tasks with parameters for model capacity (dim, depth), state representation (d_state, dt_rank), and image processing (patch_size, image_size).
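A minimal usage sketch, assuming the package exposes a Vim class whose constructor accepts the parameters named above; the exact import path, parameter names (including the assumed num_classes), and defaults should be checked against the repository README.

import torch
from vision_mamba import Vim  # import path is an assumption; verify against the repo

# Hypothetical configuration mirroring the parameters described above
model = Vim(
    dim=256,           # model capacity
    depth=12,          # number of bidirectional Mamba blocks
    d_state=256,       # state-space dimension
    dt_rank=32,        # rank of the delta (dt) projection
    image_size=224,    # input resolution
    patch_size=16,     # patch embedding size
    num_classes=1000,  # classification head size (assumed parameter)
)

x = torch.randn(1, 3, 224, 224)  # batch, channels, height, width
out = model(x)                   # single forward pass
print(out.shape)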

No package. No dependents.
Maintenance: 13 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 11 / 25

Stars: 482
Forks: 21
Language: Python
License: MIT
Last pushed: Mar 09, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kyegomez/VisionMamba"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
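For programmatic access, a minimal Python sketch of the same request, assuming the endpoint returns a JSON body (requests is a third-party dependency):

import requests

# Quality-score endpoint shown above
url = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kyegomez/VisionMamba"

resp = requests.get(url, timeout=10)
resp.raise_for_status()  # fail loudly on HTTP errors or rate limiting
print(resp.json())       # assumed JSON response; schema not documented here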