Vision Transformer Optimization ML Frameworks
Official implementations and research papers focused on improving Vision Transformer architectures through efficiency enhancements, dynamic token pruning, hierarchical designs, and architectural innovations. Does NOT include general computer vision frameworks, multimodal models, or non-transformer-based vision approaches.
There are 109 vision transformer optimization frameworks tracked. 8 score above 50 (established tier). The highest-rated is zhanghang1989/ResNeSt at 67/100 with 3,264 stars and 11,896 monthly downloads. 1 of the top 10 is actively maintained.
Get all 109 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=vision-transformer-optimization&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
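For scripted access, the same request can be sketched in Python with only the standard library. The endpoint and the `domain`, `subcategory`, and `limit` parameters are taken from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical, and the response is assumed to be JSON with no particular schema. Whether a larger `limit` returns all 109 projects in one call is also an assumption, so check it against the API before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="ml-frameworks",
              subcategory="vision-transformer-optimization",
              limit=20):
    """Return the query URL; parameter names mirror the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch and decode the response. Assumes the body is JSON (not verified here)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Usage (performs a network request, so it is left commented out):
# data = fetch_projects(build_url())
# print(type(data))
```

With the defaults, `build_url()` reproduces the URL from the curl command exactly. Keeping the fetch separate from URL construction lets you inspect the request before spending one of the 100 daily keyless requests.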
| # | Framework | Score | Tier |
|---|---|---|---|
| 1 | zhanghang1989/ResNeSt: ResNeSt: Split-Attention Networks | | Established |
| 2 | berniwal/swin-transformer-pytorch: Implementation of the Swin Transformer in PyTorch. | | Established |
| 3 | Jittor/jittor: Jittor is a high-performance deep learning framework based on JIT compiling... | | Established |
| 4 | NVlabs/FasterViT: [ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision... | | Established |
| 5 | ViTAE-Transformer/ViTPose: The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer... | | Established |
| 6 | sniklaus/pytorch-pwc: a reimplementation of PWC-Net in PyTorch that matches the official Caffe version | | Established |
| 7 | microsoft/CvT: This is an official implementation of CvT: Introducing Convolutions to... | | Established |
| 8 | gaohuang/MSDNet: Multi-Scale Dense Networks for Resource Efficient Image Classification (ICLR... | | Established |
| 9 | vra/dinov2-retrieval: A cli program of image retrieval using dinov2 | | Emerging |
| 10 | tobna/WhatTransformerToFavor: Github repository for the paper Which Transformer to Favor: A Comparative... | | Emerging |
| 11 | Khrylx/AgentFormer: [ICCV 2021] Official PyTorch Implementation of "AgentFormer: Agent-Aware... | | Emerging |
| 12 | google-research/big_transfer: Official repository for the "Big Transfer (BiT): General Visual... | | Emerging |
| 13 | richzhang/PerceptualSimilarity: LPIPS metric. pip install lpips | | Emerging |
| 14 | iduta/pyconv: Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual... | | Emerging |
| 15 | jwr1995/dc1d: A 1D implementation of a deformable convolutional layer in PyTorch with a few tricks. | | Emerging |
| 16 | walsvid/CoordConv: Pytorch implementation of "An intriguing failing of convolutional neural... | | Emerging |
| 17 | VicenteVivan/geo-clip: This is an official PyTorch implementation of our NeurIPS 2023 paper... | | Emerging |
| 18 | bwconrad/vit-finetune: Fine-tuning Vision Transformers on various classification datasets | | Emerging |
| 19 | raoyongming/DynamicViT: [NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with... | | Emerging |
| 20 | clovaai/rexnet: Official Pytorch implementation of ReXNet (Rank eXpansion Network) with... | | Emerging |
| 21 | innat/DOLG-TensorFlow: Implementation of Deep Orthogonal Fusion of Local and Global Features in TensorFlow 2 | | Emerging |
| 22 | Yangzhangcst/Transformer-in-Computer-Vision: A paper list of some recent Transformer-based CV works. | | Emerging |
| 23 | LeapLabTHU/DAT: Repository of Vision Transformer with Deformable Attention (CVPR2022) and... | | Emerging |
| 24 | kampta/DeepLayout: PyTorch implementation of "LayoutTransformer: Layout Generation and... | | Emerging |
| 25 | ShirAmir/dino-vit-features: Official implementation for the paper "Deep ViT Features as Dense Visual... | | Emerging |
| 26 | Renumics/mesh2vec: Turn CAE mesh data => aggregated element feature vectors for ML | | Emerging |
| 27 | thuml/Xlearn: Transfer Learning Library | | Emerging |
| 28 | fkodom/yet-another-retnet: A simple but robust PyTorch implementation of RetNet from "Retentive... | | Emerging |
| 29 | htdt/hyp_metric: Hyperbolic Vision Transformers: Combining Improvements in Metric Learning \|... | | Emerging |
| 30 | chenhaoxing/SSFormers: This repository is the code of the paper "Sparse Spatial Transformers for... | | Emerging |
| 31 | mit-han-lab/offsite-tuning: Offsite-Tuning: Transfer Learning without Full Model | | Emerging |
| 32 | alon-albalak/TLiDB: Transfer Learning in Dialogue Benchmarking Toolkit | | Emerging |
| 33 | ChristophReich1996/MaxViT: PyTorch reimplementation of the paper "MaxViT: Multi-Axis Vision... | | Emerging |
| 34 | baraline/convst: Implementation of the Random Dilated Shapelet Transform algorithm along with... | | Emerging |
| 35 | dongkyunk/DOLG-pytorch: Unofficial PyTorch Implementation of "DOLG: Single-Stage Image Retrieval... | | Emerging |
| 36 | AaltoVision/DGC-Net: A PyTorch implementation of "DGC-Net: Dense Geometric Correspondence Network" | | Emerging |
| 37 | amazon-science/semi-vit: PyTorch implementation of Semi-supervised Vision Transformers | | Emerging |
| 38 | NVlabs/FAN: Official PyTorch implementation of Fully Attentional Networks | | Emerging |
| 39 | PracticumAI/transfer_learning: Transfer learning is a powerful method allowing you to repurpose an AI model... | | Emerging |
| 40 | DavidLandup0/deepvision: PyTorch and TensorFlow/Keras image models with automatic weight conversions... | | Emerging |
| 41 | SunghwanHong/Cost-Aggregation-transformers: Official implementation of CATs | | Emerging |
| 42 | daniel-code/TubeViT: An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse... | | Emerging |
| 43 | FrancescoSaverioZuppichini/ViT: Implementing Vi(sion)T(transformer) | | Emerging |
| 44 | bryanlimy/V1T: [TMLR 2023] V1T: Large-scale mouse V1 response prediction using a Vision Transformer | | Emerging |
| 45 | ViTAE-Transformer/ViTAE-Transformer: The official repo for [NeurIPS'21] "ViTAE: Vision Transformer Advanced by... | | Emerging |
| 46 | YifanXu74/Evo-ViT: Official implement of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision... | | Emerging |
| 47 | GuanRunwei/Awesome-Vision-Transformer-Collection: Variants of Vision Transformer and its downstream tasks | | Emerging |
| 48 | MosbehBarhoumiRAI/VITON-PRE-PROCESSING: This repository contains the initial implementation of pre-processing for... | | Emerging |
| 49 | AnkurDeria/MFT: Pytorch implementation of Multimodal Fusion Transformer for Remote Sensing... | | Emerging |
| 50 | xiusu/ViTAS: Code for ViTAS_Vision Transformer Architecture Search | | Emerging |
| 51 | intel/transfer-learning: Libraries and tools to support Transfer Learning | | Emerging |
| 52 | graldij/transformer-fusion: Official repository of the "Transformer Fusion with Optimal Transport"... | | Emerging |
| 53 | johndpope/OmniTransfer-hack: OmniTransfer implementation for LTX-2 (work in progress) | | Emerging |
| 54 | paulgavrikov/CNN-Filter-DB: A database of over 1.4 billion 3x3 convolution filters extracted from... | | Emerging |
| 55 | shashankvkt/DoRA_ICLR24: This repo contains the official implementation of ICLR 2024 paper "Is... | | Emerging |
| 56 | apple/parameterized-transforms: torchvision-based transforms that provide access to parameterization | | Emerging |
| 57 | nerminnuraydogan/vision-transformer: Vision Transformer explanation and implementation with PyTorch | | Emerging |
| 58 | altndrr/vic: Code implementation of our NeurIPS 2023 paper: Vocabulary-free Image Classification | | Emerging |
| 59 | ViTAE-Transformer/ViTAE-VSA: The official repo for [ECCV'22] "VSA: Learning Varied-Size Window Attention... | | Emerging |
| 60 | billpsomas/simpool: This repo contains the official implementation of ICCV 2023 paper "Keep It... | | Experimental |
| 61 | NU-CUCIS/CrossPropertyTL: Cross-property Deep Transfer Learning | | Experimental |
| 62 | Rishit-dagli/Transformer-in-Transformer: An Implementation of Transformer in Transformer in TensorFlow for image... | | Experimental |
| 63 | mako443/Text2Pos-CVPR2022: Code, dataset and models for our CVPR 2022 publication "Text2Pos" | | Experimental |
| 64 | iduta/coconv: [ICCV W] Contextual Convolutional Neural Networks... | | Experimental |
| 65 | pavlo-melnyk/mlgp-embedme: The official implementation of the "Embed Me If You Can: A Geometric... | | Experimental |
| 66 | JoanaR/multi-mode-CNN-pytorch: A PyTorch implementation of the Multi-Mode CNN to reconstruct Chlorophyll-a... | | Experimental |
| 67 | shikishima-TasakiLab/Involution-PyTorch: Unofficial PyTorch reimplemention of the paper "Involution: Inverting the... | | Experimental |
| 68 | ViTAE-Transformer/LeMeViT: The official repo for [IJCAI'24] "LeMeViT: Efficient Vision Transformer with... | | Experimental |
| 69 | materight/RepNet-pytorch: A PyTorch port with pre-trained weights of RepNet, from "Counting Out Time:... | | Experimental |
| 70 | benbergner/cropr: A token pruning method that accelerates ViTs for various tasks while... | | Experimental |
| 71 | altndrr/vicss: Code implementation of our paper: Vocabulary-free Image Classification and... | | Experimental |
| 72 | dimiz51/FaceViT: FaceViT: A multi-task Vision Transformer for face detection, age estimation,... | | Experimental |
| 73 | insitro/ContextViT: Contextual Vision Transformers for Robust Representation Learning | | Experimental |
| 74 | WalterSimoncini/fungivision: Library implementation of "No Train, all Gain: Self-Supervised Gradients... | | Experimental |
| 75 | Lahdhirim/CV-human-pose-classifier-ViT-aws: Human Pose Classifier using Vision Transformers (ViT) – end-to-end pipeline... | | Experimental |
| 76 | jman4162/PyTorch-Vision-Transformers-ViT: Explore fine-tuning the Vision Transformer (ViT) model for object... | | Experimental |
| 77 | gianlucarloni/CoCoReco: Code base for our paper "Connectivity-Inspired Network for Context-Aware... | | Experimental |
| 78 | Tejeshyewale/transfer_learning_in_Deeplearning: This project demonstrates image classification using transfer learning with... | | Experimental |
| 79 | Atharv279/Transfer-Learning: Files containing projects related to Transfer Learning | | Experimental |
| 80 | suous/RecNeXt: RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations | | Experimental |
| 81 | alantess/transformer: Implementation of a modified vision transformer on the crypto market space | | Experimental |
| 82 | ViLab-UCSD/MemSAC_ECCV2022: PyTorch code for MemSAC. To appear in ECCV 2022. | | Experimental |
| 83 | EthanBnntt/tinygrad-vit: A minimalist implementation of the ViT (Vision Transformer) model, using tinygrad | | Experimental |
| 84 | RohanG9929/LoFTR-in-Tensorflow: Code for our re-implementation of "LoFTR: Detector-Free Local Feature... | | Experimental |
| 85 | PegHeads-Inc/PegHeads-Tutorial-4: TRANSFER LEARNING: TO CREATE A PRE-TRAINED MODEL | | Experimental |
| 86 | OSU-MLB/ViT_PEFT_Vision: [CVPR'25 (Highlight)] Lessons and Insights from a Unifying Study of... | | Experimental |
| 87 | EmPasLab/ExMobileVIT: ExMobileViT: Lightweight Classifier Extension for Mobile Vision Transformer | | Experimental |
| 88 | janaalbader28/Waste-Classification-ViT: Exploring the use of Vision Transformers (ViT) for waste classification | | Experimental |
| 89 | chinefed/convolutional-set-transformer: Official implementation of the Convolutional Set Transformer (Chinello &... | | Experimental |
| 90 | sanket-poojary-03/Fine-tuning-ViVit: Python script to fine tune Open source Video Vision Transformer (ViVit)... | | Experimental |
| 91 | lizhh268/FSSUWNet: [IJCNN 2025 Oral] Official implementation of paper: FSSUWNet: Mitigating the... | | Experimental |
| 92 | zhouchenlin2096/Awesome-Transformer-for-Vision-Recognition: A comprehensive paper list of Transformer & Attention for Vision Recognition... | | Experimental |
| 93 | BobMcDear/vit-pytorch: PyTorch implementation of the vision transformer | | Experimental |
| 94 | rentainhe/ViT.pytorch: The Pytorch reimplementation of Vision Transformer | | Experimental |
| 95 | EvgenyKashin/non-leaking-conv: Implementation of Spectral Leakage and Rethinking the Kernel Size in CNNs in Pytorch | | Experimental |
| 96 | AliKHaliliT/MobileViViT: MobileViViT, a higher dimensional adaptation of MobileViT | | Experimental |
| 97 | VikramRangarajan/SIEDD: A fast coordinate-based neural video encoder | | Experimental |
| 98 | zs1314/Fraesormer: 【ICME2025 Oral】Offical Pytorch Code for "Fraesormer: Learning Adaptive... | | Experimental |
| 99 | kyegomez/LongVit: A simplistic pytorch implementation of LongVit using my previous... | | Experimental |
| 100 | dabane-ghassan/int-lab-book: Foveated Spatial Transformers | | Experimental |
| 101 | jiaowoguanren0615/DINOV2-Pytorch: This is a warehouse for DinoV2-models, based pytorch framework. | | Experimental |
| 102 | eithannak29/NanoDiffVision: NanoDiffVision explores Differential Attention as a natural evolution of... | | Experimental |
| 103 | nick8592/ViT-Classification-CIFAR10: This repository contains an implementation of the Vision Transformer (ViT)... | | Experimental |
| 104 | MohammadRoodbari/Image-Classification: image classification with fine tuning the BEiT vision transformer on CIFAR 10 dataset | | Experimental |
| 105 | lucasjvds/ViT-for-Dark-Matter-Morphology: Under the international Google Summer of Code program, the project... | | Experimental |
| 106 | sntsemilio/Transfer-learning: A machine learning project focused on transfer learning techniques using... | | Experimental |
| 107 | iijumanaAhmed/Waste-Classification-ViT: Exploring the use of Vision Transformers (ViT) for waste classification | | Experimental |
| 108 | AriPathak/ViT-Berkley-CS198-HW4-Solution: My pytorch implemented solution to the Fall 2020 UC Berkley CS198 ViT... | | Experimental |
| 109 | OmarAlsaqa/GeoViG: Implementation for GeoViG: Geometry-Aware Graph Reasoning for Mobile Vision... | | Experimental |