kyegomez/VisionMamba
Implementation of Vision Mamba from the paper "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". The paper reports that Vim is 2.8x faster than DeiT and saves 86.8% of GPU memory when performing batch inference to extract features on high-resolution images.
Bidirectional state space models replace traditional attention mechanisms to achieve linear complexity, enabling efficient processing of high-resolution images with significantly reduced memory footprint. The architecture uses patch embedding with configurable depth and state dimensions, integrated with PyTorch for straightforward model instantiation and forward pass inference. Supports flexible configuration across vision tasks with parameters for model capacity (dim, depth), state representation (d_state, dt_rank), and image processing (patch_size, image_size).
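The linear-complexity claim comes from replacing quadratic attention with a recurrent state-space scan over the patch sequence, run in both directions. Below is a minimal NumPy sketch of that idea, not the repository's code: it uses a fixed diagonal state matrix, whereas real Mamba blocks make the SSM parameters input-dependent (selective) and use hardware-aware fused scans. All names (`ssm_scan`, `bidirectional_ssm`) and the toy shapes are illustrative assumptions.

```python
import numpy as np

def ssm_scan(x, a, B, C):
    """One-direction diagonal SSM recurrence: h_t = a * h_{t-1} + B @ x_t, y_t = C @ h_t.
    x: (L, dim) patch embeddings; a: (d_state,) per-channel decay;
    B: (d_state, dim) input projection; C: (dim, d_state) output projection.
    Cost is O(L * d_state * dim): linear in sequence length L."""
    h = np.zeros(B.shape[0])
    y = np.zeros_like(x)
    for t in range(x.shape[0]):
        h = a * h + B @ x[t]   # state update
        y[t] = C @ h           # readout
    return y

def bidirectional_ssm(x, a, B, C):
    """Sum of a forward scan and a backward scan over the same sequence,
    mimicking (in simplified form) the bidirectional design described above."""
    fwd = ssm_scan(x, a, B, C)
    bwd = ssm_scan(x[::-1], a, B, C)[::-1]  # scan reversed sequence, flip back
    return fwd + bwd

# Toy example: 196 patches (a 14x14 grid from a 224x224 image with patch_size=16)
rng = np.random.default_rng(0)
L, dim, d_state = 196, 8, 4
x = rng.standard_normal((L, dim))
a = rng.uniform(0.1, 0.9, d_state)          # stable decay in (0, 1)
B = 0.1 * rng.standard_normal((d_state, dim))
C = 0.1 * rng.standard_normal((dim, d_state))
out = bidirectional_ssm(x, a, B, C)
print(out.shape)  # (196, 8)
```

Because each patch updates a fixed-size state instead of attending to every other patch, memory stays constant in sequence length, which is the mechanism behind the reduced footprint on high-resolution images.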
Stars: 482
Forks: 21
Language: Python
License: MIT
Category:
Last pushed: Mar 09, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kyegomez/VisionMamba"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related frameworks
kkakkkka/MambaTalk
[NeurIPS 2024] The official code of MambaTalk: Efficient Holistic Gesture Synthesis with...
SiavashShams/ssamba
[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning...
kaistmm/Audio-Mamba-AuM
Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio...
FarnoushRJ/MambaLRP
[NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space...
zs1314/SkinMamba
[ACCVW2025 Oral] Official PyTorch code for "SkinMamba: A Precision Skin Lesion Segmentation...