Vision Transformer Classification Models
Tools and models for image classification using transformer architectures (Vision Transformers, SigLIP, BEiT, etc.). Does NOT include general image captioning, vision-language retrieval, or multi-label classification frameworks without transformer-based implementations.
There are 21 vision transformer classification models tracked. The highest-rated is QData/C-Tran at 38/100 with 280 stars.
Get the projects as JSON (the example request below sets limit=20, one short of all 21)
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=vision-transformer-classification&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
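The same request can be made from Python. The following is a minimal sketch using only the standard library; it assumes nothing about the API beyond the URL and the three query parameters shown in the curl command. The function names `build_url` and `fetch_projects` are illustrative, and because the response schema is not described here, the parsed JSON is returned unmodified.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain, subcategory, limit=20):
    """Assemble the query URL with the same three parameters as the curl call."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """GET the URL and decode the body as JSON.

    The response schema is not documented on this page, so the parsed
    value is returned as-is rather than indexed by assumed field names.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_url("transformers", "vision-transformer-classification")
# Uncomment to perform the network request (counts against the daily quota):
# data = fetch_projects(url)
# print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to inspect the exact request before spending one of the daily requests.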
| # | Model | Score | Tier |
|---|---|---|---|
| 1 | QData/C-Tran: General Multi-label Image Classification with Transformers | | Emerging |
| 2 | jesus3476/Fire-Detection-Siglip2: Fire-Detection-Siglip2 is an image classification vision-language encoder... | | Emerging |
| 3 | pagraf/Seabed-Net: Quick start guide for Seabed-Net | | Emerging |
| 4 | moharamfatema/graduation-project: Video vision transformers for hierarchical anomaly detection in video scenes. | | Experimental |
| 5 | apollosoldier/Advanced-Classifier: The Advanced Classification Model is a deep learning-based approach for... | | Experimental |
| 6 | PRITHIVSAKTHIUR/Fire-Detection-Siglip2: Fire-Detection-Siglip2 is an image classification vision-language encoder... | | Experimental |
| 7 | mohsenMahmoodzadeh/image-and-text-classifier: Deep learning models (CNN, LSTM, BERT) for image and text classification task... | | Experimental |
| 8 | kunjmehta/cross-modal-retrieval-food-ai: Course project for 198:536 at Rutgers University. The project is about... | | Experimental |
| 9 | samibahig/Document-Image-Understanding-and-Analysis: Document Image Understanding: Analysis of 2 datasets | | Experimental |
| 10 | 00200200/Video-Waste-Dumping-Detection---IWDD: International Contest on Illegal Waste Dumping Detection | | Experimental |
| 11 | exarchou/Food-Categorization-via-Prediction-of-Ingredients: This repository contains the source code for my Thesis in the Department of... | | Experimental |
| 12 | zaaachos/Thesis-Diagnostic-Captioning: B.Sc. Thesis Deep Learning & NLP research on Medical Image Captioning | | Experimental |
| 13 | nikola310/svhn_classification: Classification of house numbers | | Experimental |
| 14 | opencodeiiita/Pestering-Data: Build an image classification model working with a real world dataset. | | Experimental |
| 15 | PRITHIVSAKTHIUR/Gym-Workout-Classifier-SigLIP2: Gym-Workout-Classifier-SigLIP2 is an image classification vision-language... | | Experimental |
| 16 | lawrenceokolo1/vit-faiss-product-recommendation: Production-grade visual product recommendation using ViT + FAISS on Amazon... | | Experimental |
| 17 | AD-Archer/hugging-face-foodguesser: Food Category Classification - A Python tool that uses deep learning to... | | Experimental |
| 18 | PRITHIVSAKTHIUR/Painting-126-DomainNet: Painting-126-DomainNet is an image classification vision-language encoder... | | Experimental |
| 19 | karthek-git/gic: Efficient classification of Mobile Gallery Images | | Experimental |
| 20 | amgawishx/dnn_vision_classifiers: End-to-end ML pipeline for training different DNN vision classifiers and... | | Experimental |
| 21 | PRITHIVSAKTHIUR/Traffic-Density-Classification: Traffic-Density-Classification is an image classification vision-language... | | Experimental |