kyegomez/VisionMamba
Implementation of Vision Mamba from the paper "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". The paper reports that Vim is 2.8x faster than DeiT and saves 86.8% of GPU memory when performing batch inference to extract features on high-resolution images.
Bidirectional state space models replace traditional attention mechanisms to achieve linear complexity, enabling efficient processing of high-resolution images with significantly reduced memory footprint. The architecture uses patch embedding with configurable depth and state dimensions, integrated with PyTorch for straightforward model instantiation and forward pass inference. Supports flexible configuration across vision tasks with parameters for model capacity (dim, depth), state representation (d_state, dt_rank), and image processing (patch_size, image_size).
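The linear-complexity claim comes from replacing quadratic attention with a recurrent state-space scan over the patch sequence, run in both directions. Below is a minimal NumPy sketch of that idea, not the repository's code: it uses a fixed diagonal state matrix, whereas real Mamba blocks make the SSM parameters input-dependent (selective) and use hardware-aware fused scans. All names (`ssm_scan`, `bidirectional_ssm`) and the toy shapes are illustrative assumptions.

```python
import numpy as np

def ssm_scan(x, a, B, C):
    """One-direction diagonal SSM recurrence: h_t = a * h_{t-1} + B @ x_t, y_t = C @ h_t.
    x: (L, dim) patch embeddings; a: (d_state,) per-channel decay;
    B: (d_state, dim) input projection; C: (dim, d_state) output projection.
    Cost is O(L * d_state * dim): linear in sequence length L."""
    h = np.zeros(B.shape[0])
    y = np.zeros_like(x)
    for t in range(x.shape[0]):
        h = a * h + B @ x[t]   # state update
        y[t] = C @ h           # readout
    return y

def bidirectional_ssm(x, a, B, C):
    """Sum of a forward scan and a backward scan over the same sequence,
    mimicking (in simplified form) the bidirectional design described above."""
    fwd = ssm_scan(x, a, B, C)
    bwd = ssm_scan(x[::-1], a, B, C)[::-1]  # scan reversed sequence, flip back
    return fwd + bwd

# Toy example: 196 patches (a 14x14 grid from a 224x224 image with patch_size=16)
rng = np.random.default_rng(0)
L, dim, d_state = 196, 8, 4
x = rng.standard_normal((L, dim))
a = rng.uniform(0.1, 0.9, d_state)          # stable decay in (0, 1)
B = 0.1 * rng.standard_normal((d_state, dim))
C = 0.1 * rng.standard_normal((dim, d_state))
out = bidirectional_ssm(x, a, B, C)
print(out.shape)  # (196, 8)
```

Because each patch updates a fixed-size state instead of attending to every other patch, memory stays constant in sequence length, which is the mechanism behind the reduced footprint on high-resolution images.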
Stars: 482
Forks: 21
Language: Python
License: MIT
Category:
Last pushed: Mar 09, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/kyegomez/VisionMamba"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related frameworks
kkakkkka/MambaTalk
[NeurIPS 2024] The official code of MambaTalk: Efficient Holistic Gesture Synthesis with...
SiavashShams/ssamba
[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning...
kaistmm/Audio-Mamba-AuM
Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio...
FarnoushRJ/MambaLRP
[NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space...
zs1314/SkinMamba
[ACCVW2025 Oral] Official PyTorch code for "SkinMamba: A Precision Skin Lesion Segmentation...