3D Vision Transformers: Transformer Models

Tools for 3D computer vision tasks using transformers, including depth estimation, multi-view geometry, structure-from-motion, point cloud processing, 3D pose estimation, and novel view synthesis. Does NOT include general 2D vision tasks, 2D pose estimation, or 3D shape generation without vision inputs.

There are 83 3D vision transformer models tracked. 5 score above 50 (Established tier). The highest-rated is NVlabs/MambaVision at 69/100 with 2,060 stars. 1 of the top 10 is actively maintained.

Get all 83 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=3d-vision-transformers&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
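The same request can be issued from Python. The sketch below rebuilds the URL from the curl example using only the standard library. The function names `build_quality_url` and `fetch_quality` are hypothetical, and the shape of the JSON response is not documented on this page, so the code only decodes it without assuming any fields. The sample URL carries `limit=20`; whether a larger value returns all 83 projects in one response is an assumption, so `limit` is left adjustable.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain="transformers",
                      subcategory="3d-vision-transformers",
                      limit=20):
    """Return the dataset URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """GET the URL and decode the body as JSON.

    The response schema is not described on this page, so the decoded
    object is returned as-is (assumption: the body is valid JSON).
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality(build_quality_url())` performs the network request and counts against the 100 requests/day anonymous quota mentioned above.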

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | NVlabs/MambaVision | [CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid... | 69 | Established |
| 2 | sign-language-translator/sign-language-translator | Python library & framework to build custom translators for the... | 64 | Established |
| 3 | kyegomez/Jamba | PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" | 56 | Established |
| 4 | fashn-AI/fashn-human-parser | Human parsing model for fashion and virtual try-on applications | 55 | Established |
| 5 | autonomousvision/transfuser | [PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for... | 55 | Established |
| 6 | kyegomez/MultiModalMamba | A novel implementation of fusing ViT with Mamba into a fast, agile, and high... | 49 | Emerging |
| 7 | dali92002/DocEnTR | DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022 | 47 | Emerging |
| 8 | buaacyw/MeshAnything | [ICLR 2025] From anything to mesh like human artists. Official impl. of... | 44 | Emerging |
| 9 | buaacyw/MeshAnythingV2 | [ICCV 2025] From anything to mesh like human artists. Official impl. of... | 44 | Emerging |
| 10 | linjieli222/HERO | Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for... | 44 | Emerging |
| 11 | wgcban/HyperTransformer | [CVPR'22] HyperTransformer: A Textural and Spectral Feature Fusion... | 41 | Emerging |
| 12 | PediaMedAI/AggPose | [IJCAI 2022] Official PyTorch implementation of AggPose: Deep Aggregation... | 40 | Emerging |
| 13 | AllenXiangX/SnowflakeNet | (TPAMI 2023) Snowflake Point Deconvolution for Point Cloud Completion and... | 40 | Emerging |
| 14 | padeler/PE-former | 2D Human Pose estimation using transformers. Implementation in Pytorch | 39 | Emerging |
| 15 | AyushExel/trolo | An SDK for Transformers + YOLO and other SSD family models | 39 | Emerging |
| 16 | ChenRocks/UNITER | Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt... | 39 | Emerging |
| 17 | xingyizhou/GTR | Global Tracking Transformers, CVPR 2022 | 38 | Emerging |
| 18 | hasanirtiza/PedesFormer-Transformer-Networks-For-Pedestrian-Detection | Transformer Networks for Pedestrian Detection | 38 | Emerging |
| 19 | icon-lab/SLATER | Official implementation of the paper: Unsupervised MRI Reconstruction via... | 38 | Emerging |
| 20 | jhcho99/CoFormer | [CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for... | 37 | Emerging |
| 21 | desaixie/zeroverse | Official code for NeurIPS 2024 paper LRM-Zero: Training Large Reconstruction... | 37 | Emerging |
| 22 | csiro-robotics/HOTFormerLoc | [IEEE/CVF CVPR 2025] Hierarchical Octree Transformer for Versatile Lidar... | 36 | Emerging |
| 23 | cgtuebingen/ua3dscancomp | Latent Uncertainty-Aware Multi-View SDF Scan Completion | 36 | Emerging |
| 24 | yihongXU/TransCenter | This is the official implementation of TransCenter (TPAMI). The code and... | 36 | Emerging |
| 25 | snktshrma/ngps_flight | Global vision positioning system for UAVs in outdoor GNSS-denied environments | 34 | Emerging |
| 26 | jhcho99/GSRTR | [BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition... | 33 | Emerging |
| 27 | kyegomez/AudioMamba | Implementation of the paper: "Audio Mamba: Bidirectional State Space Model... | 33 | Emerging |
| 28 | XunshanMan/MVGFormer | This is the official implementation of the work presented at CVPR 2024,... | 32 | Emerging |
| 29 | zubair-irshad/NeRF-MAE | [ECCV 2024] Pytorch code for our ECCV'24 paper NeRF-MAE: Masked AutoEncoders... | 32 | Emerging |
| 30 | xmartlabs/spoter-embeddings | Create embeddings from sign pose videos using Transformers | 32 | Emerging |
| 31 | kyegomez/MambaDecoderBlock | MambaDecoderBlock is a novel decoder architecture that replaces traditional... | 31 | Emerging |
| 32 | VachanVY/Transfusion.torch | PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse... | 30 | Emerging |
| 33 | hukenovs/slovo | Slovo: Russian Sign Language Dataset and Models | 30 | Emerging |
| 34 | eslambakr/LAR-Look-Around-and-Refer | This is the official implementation for our paper;"LAR:Look Around and Refer". | 29 | Experimental |
| 35 | tthinking/MATR | [IEEE TIP 2022] Official implementation of MATR: Multimodal Medical Image... | 29 | Experimental |
| 36 | sauradip/STALE | [ECCV 2022] Official Pytorch Implementation of the paper : " Zero-Shot... | 29 | Experimental |
| 37 | Warren-SJ/SLAM3R | A study of the research paper SLAM3R:Real-Time Dense Scene Reconstruction... | 28 | Experimental |
| 38 | DEV-D-GR8/SignSense | This repository contains a transformer-based model for real-time American... | 28 | Experimental |
| 39 | sam575/axial-gan | Code for "Simultaneous Face Hallucination and Translation for Thermal to... | 28 | Experimental |
| 40 | kyegomez/VLM-Mamba | We introduce VLM-Mamba, the first Vision-Language Model built entirely on... | 26 | Experimental |
| 41 | kyegomez/SimpleMamba | Implementation of a modular, high-performance, and simplistic mamba for... | 26 | Experimental |
| 42 | AndrewBoessen/PerfectRep | PerfectRep is a 3D pose estimation model tailored specifically for... | 26 | Experimental |
| 43 | ShengcaiLiao/TransMatcher | [NeurIPS 2021] TransMatcher: Deep Image Matching Through Transformers for... | 25 | Experimental |
| 44 | Suvroneel/ToyKing | A Python prototype that converts 2D photos or text prompts into 3D models... | 24 | Experimental |
| 45 | bhanuprathap2000/sign-language-recognition | This repo contains the code for sign-language-recognition as part of our... | 24 | Experimental |
| 46 | Merterm/Modeling-Intensification-for-SLG | Public repo for the paper: "Modeling Intensification for Sign Language... | 24 | Experimental |
| 47 | NeurAI-Lab/MT-SfMLearner | Official code for 'Transformers in Unsupervised Structure-from-Motion' and... | 24 | Experimental |
| 48 | GregorKobsik/ImageTransformer | This notebook shows a basic implementation of a transformer (decoder)... | 23 | Experimental |
| 49 | kyegomez/Simba | A simpler Pytorch + Zeta Implementation of the paper: "SiMBA: Simplified... | 23 | Experimental |
| 50 | xiuqhou/DAPE | [AAAI2026] Official implementation of the paper "DAPE: Harmonizing... | 23 | Experimental |
| 51 | lamm-mit/FieldCompleter | GAN/convolutional and Transformer models to predict missing mechanical... | 22 | Experimental |
| 52 | loubnabnl/Sign-Segmentation-with-Transformers | Detection of temporal boundaries in sign language videos, as part of the... | 22 | Experimental |
| 53 | anupvna/street-view-geolocation | Multi-view Deep Learning pipeline using PyTorch to predict global... | 22 | Experimental |
| 54 | HowieMa/PPT | [ECCV 2022] "PPT: token-Pruned Pose Transformer for monocular and multi-view... | 21 | Experimental |
| 55 | LookUpMark/dylem-grid | DYLEM-GRID is a deep learning project for dynamic hand gesture recognition... | 20 | Experimental |
| 56 | sauradip/fewshotQAT | [BMVC 2021]: Official PyTorch implementation of : "Few Shot Temporal Action... | 19 | Experimental |
| 57 | arafathosense/Real-Time-Face-Glitch-Effect-Controlled-by-Hand-Gestures | A real-time interactive computer vision art project using OpenCV. Control a... | 19 | Experimental |
| 58 | Abdullah-Shah-26/Sign-Cast | Real-time AI-powered voice-to-sign language translator. Converts speech to... | 19 | Experimental |
| 59 | freddxvill/Proyecto_Traductor_de_la_LSB | Translator from Bolivian Sign Language (LSB) to text using networks... | 18 | Experimental |
| 60 | exitudio/GaitMixer | Official repository for "GaitMixer: Skeleton-based Gait Representation... | 18 | Experimental |
| 61 | icon-lab/TranSMS | Official Implementation of Transformers for System Matrix Super-resolution (TranSMS) | 17 | Experimental |
| 62 | albrateanu/KANT | [Sensors 2025] Enhancing Low-Light Images with Kolmogorov–Arnold Networks in... | 16 | Experimental |
| 63 | musialski-lab/LayoutEnhancer | Source code for the Paper: Layout Enhancer | 16 | Experimental |
| 64 | mabdn/feasible-interpretable-trajectory-prediction | A Transformer neural network for autonomous driving to predict the future... | 15 | Experimental |
| 65 | artem-gorodetskii/TransPix2Pix | Rethinking the Pix2Pix architecture with attention mechanisms and transformers. | 15 | Experimental |
| 66 | AshutoshKulkarni4998/AIDTransformer | Inference code for "Aerial Image Dehazing with Attentive Deformable... | 15 | Experimental |
| 67 | rukmini-17/scalable-sequence-modeling | Comparative analysis of Mamba vs. Transformers trained from scratch.... | 15 | Experimental |
| 68 | mustafa1728/Person-Re-ID | Experiments on some existing Re-ID methods on a different dataset with... | 15 | Experimental |
| 69 | Suvroneel/Forma-3D-Vision-Engine | Converts 2D photos into 3D meshes using monocular depth estimation and... | 15 | Experimental |
| 70 | RisabBiswas/T2T-BinFormer | SOTA Document Image Enhancement - T2T-BinFormer: Effective Document Image... | 14 | Experimental |
| 71 | fabiosilva781/top-cvpr-2025-papers | 🌟 Discover top CVPR 2025 papers for insightful research in computer vision,... | 14 | Experimental |
| 72 | Microsatellites-and-Space-Microsystems/pose_estimation_domain_gap | Two methods for solving domain gap in satellite pose estimation in space... | 14 | Experimental |
| 73 | gmongaras/2Mamba2Furious | Code for the paper "2Mamba2Furious: Linear in complexity, competitive in accuracy" | 14 | Experimental |
| 74 | miaodd98/ITrans-Generative-Image-Inpainting-with-Transformers-ChinaMM-2023-Multimedia-Systems | ITrans: Generative Image Inpainting with Transformers, ChinaMM 2023,... | 13 | Experimental |
| 75 | shayanamir0/Just-Image-Transformers | implementation of Just Image Transformer from the paper "Back to Basics: Let... | 13 | Experimental |
| 76 | tthinking/SETFusion | [PR 2026] Official implementation of SETFusion: A Semantic Transformer for... | 12 | Experimental |
| 77 | GregorKobsik/Octree-Transformer | Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically... | 12 | Experimental |
| 78 | zwh0527/AGRNet | Code for "Mining Global Relativity Consistency without Neighborhood Modeling... | 12 | Experimental |
| 79 | junayed-hasan/spontaneous-smile-recognition | A deep learning framework for distinguishing spontaneous from posed smiles... | 12 | Experimental |
| 80 | aliebayani/TransGAN-DX | A Hybrid Transformer-GAN Approach for Cardiovascular Disease Diagnosis | 12 | Experimental |
| 81 | botmahn/slowfast | An unofficial pytorch implementation of "Early Anticipation of Driving... | 11 | Experimental |
| 82 | n1ghtf4l1/decipher-engine | Detect and Translate American Sign Language (ASL) fingerspelling into text. | 10 | Experimental |
| 83 | codedmachine111/Image_generation_using_transformers_in_GANs | Image Generation using Transformers in GANs | 10 | Experimental |
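
The tier labels in this listing appear to follow score bands. A minimal sketch of that mapping is given below, with the cutoffs inferred from the listing rather than taken from any published rule: the summary says scores above 50 are Established, a score of 30 is labeled Emerging, and a score of 29 is labeled Experimental. The function name `tier_for` is hypothetical, and the treatment of a score of exactly 50 is a guess, since no project in the listing has that score.

```python
def tier_for(score):
    """Map a 0-100 quality score to a tier label.

    Cutoffs are inferred from the listing, not from a documented rule:
    - above 50 -> Established (per the page summary)
    - 30 to 50 -> Emerging (30 is the lowest Emerging score listed;
      placing exactly 50 here is an assumption)
    - below 30 -> Experimental (29 is the highest Experimental score listed)
    """
    if score > 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# A few (project, score) pairs copied from the listing.
samples = [
    ("NVlabs/MambaVision", 69),
    ("kyegomez/MultiModalMamba", 49),
    ("hukenovs/slovo", 30),
    ("tthinking/MATR", 29),
]

for name, score in samples:
    print(f"{name}: {score} -> {tier_for(score)}")
```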

Comparisons in this category