All Transformer Models
6,429 models ranked by quality score · Page 11 of 65
| # | Model | Score | Tier |
|---|---|---|---|
| 1001 |
xmindflow/Awesome-Transformer-in-Medical-Imaging
[MedIA Journal] An ultimately comprehensive paper list of Vision... |
|
Emerging |
| 1002 |
NetEase-Media/grps_trtllm
Higher performance OpenAI LLM service than vLLM serve: A pure C++... |
|
Emerging |
| 1003 |
mmaaz60/EdgeNeXt
[CADL'22, ECCVW] Official repository of paper titled "EdgeNeXt: Efficiently... |
|
Emerging |
| 1004 |
liuqidong07/LLM-ESR
[NeurIPS'24 Spotlight] The official implementation code of LLM-ESR. |
|
Emerging |
| 1005 |
GiovanniGatti/socratic-llm
Training pipeline for fine tuning Phi-3-mini-instruct to follow the Socratic method |
|
Emerging |
| 1006 |
SensAI-PT/LLaMa2lang
Convenience scripts to finetune (chat-)LLaMa3 and other models for any language |
|
Emerging |
| 1007 |
FreeOCR-AI/layoutreader
A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. |
|
Emerging |
| 1008 |
declare-lab/flan-alpaca
This repository contains code for extending the Stanford Alpaca synthetic... |
|
Emerging |
| 1009 |
HKUDS/OpenGraph
[EMNLP'2024] "OpenGraph: Towards Open Graph Foundation Models" |
|
Emerging |
| 1010 |
txsun1997/Black-Box-Tuning
ICML'2022: Black-Box Tuning for Language-Model-as-a-Service & EMNLP'2022:... |
|
Emerging |
| 1011 |
phronmophobic/llama.clj
Run LLMs locally. A clojure wrapper for llama.cpp. |
|
Emerging |
| 1012 |
nv-tlabs/LLaMA-Mesh
Unifying 3D Mesh Generation with Language Models |
|
Emerging |
| 1013 |
tatsu-lab/alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method... |
|
Emerging |
| 1014 |
Nuked88/ComfyUI-N-Nodes
A suite of custom nodes for ComfyUI that includes GPT text-prompt... |
|
Emerging |
| 1015 |
luchangli03/export_llama_to_onnx
Export LLaMA to ONNX |
|
Emerging |
| 1016 |
jlin816/dynalang
Code for "Learning to Model the World with Language." ICML 2024 Oral. |
|
Emerging |
| 1017 |
Rishit-dagli/Conformer
An implementation of Conformer: Convolution-augmented Transformer for Speech... |
|
Emerging |
| 1018 |
sangmichaelxie/doremi
PyTorch implementation of DoReMi, a method for optimizing the data mixture... |
|
Emerging |
| 1019 |
taufeeque9/codebook-features
Sparse and discrete interpretability tool for neural networks |
|
Emerging |
| 1020 |
EvilFreelancer/impruver
A set of scripts and configurations for pretraining of Large Language Models (LLM) |
|
Emerging |
| 1021 |
thuml/Flowformer
Code release for "Flowformer: Linearizing Transformers with... |
|
Emerging |
| 1022 |
sagorbrur/bntransformer
Bengali transformer using transformers |
|
Emerging |
| 1023 |
emapco/rk-transformers
Export and Run Hugging Face Transformers Models on Rockchip NPUs |
|
Emerging |
| 1024 |
FoundationVision/Groma
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual... |
|
Emerging |
| 1025 |
godatadriven/rhyme-with-ai
Rhyme with AI |
|
Emerging |
| 1026 |
aihao2000/DPN-LLaVA
Arxiv 25: Dynamic Pyramid Network for Efficient Multimodal Large Language Model |
|
Emerging |
| 1027 |
wgcban/HyperTransformer
[CVPR'22] HyperTransformer: A Textural and Spectral Feature Fusion... |
|
Emerging |
| 1028 |
JohnMachado11/Build-a-Large-Language-Model-from-Scratch
Building a GPT-like LLM from scratch with PyTorch. |
|
Emerging |
| 1029 |
octanove/shiba
PyTorch implementation and pre-trained Japanese model for CANINE, the... |
|
Emerging |
| 1030 |
pratyushasharma/laser
The Truth Is In There: Improving Reasoning in Language Models with... |
|
Emerging |
| 1031 |
datadreamer-dev/DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 |
|
Emerging |
| 1032 |
Ethan-yt/guwenbert
GuwenBERT (Classical Chinese BERT): A Pre-trained Language Model for Classical... |
|
Emerging |
| 1033 |
erogol/BlaGPT
Experimental playground for benchmarking language model (LM) architectures,... |
|
Emerging |
| 1034 |
skyloevil/llm-scratch-pytorch
llm-scratch-pytorch - The code is designed to be beginner-friendly, with a... |
|
Emerging |
| 1035 |
The-FinAI/CALM
An LLM training and evaluation benchmark for credit scoring |
|
Emerging |
| 1036 |
robinniesert/kaggle-google-quest
Google QUEST Q&A Labeling Kaggle Competition 6th Place Solution |
|
Emerging |
| 1037 |
jshilong/GPT4RoI
(ECCVW 2025) GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest |
|
Emerging |
| 1038 |
ymcui/PERT
PERT: Pre-training BERT with Permuted Language Model |
|
Emerging |
| 1039 |
MIV-XJTU/JanusVLN
[ICLR2026] Official implementation for "JanusVLN: Decoupling Semantics and... |
|
Emerging |
| 1040 |
varunshenoy/super-json-mode
Low latency JSON generation using LLMs ⚡️ |
|
Emerging |
| 1041 |
OpenGVLab/VisionLLM
VisionLLM Series |
|
Emerging |
| 1042 |
Eclipsess/Awesome-Efficient-Reasoning-LLMs
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large... |
|
Emerging |
| 1043 |
airaria/Visual-Chinese-LLaMA-Alpaca
Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA) |
|
Emerging |
| 1044 |
KolosalAI/Kolosal
Kolosal AI is an open-source, lightweight alternative to LM Studio to run... |
|
Emerging |
| 1045 |
liangyuwang/zo2
ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with... |
|
Emerging |
| 1046 |
adaptivetokensampling/ATS
Adaptive Token Sampling for Efficient Vision Transformers (ECCV 2022 Oral... |
|
Emerging |
| 1047 |
poloclub/LLM-Attributor
LLM Attributor: Attribute LLM's Generated Text to Training Data |
|
Emerging |
| 1048 |
TIGER-AI-Lab/QuickVideo
Quick Long Video Understanding [TMLR2025] |
|
Emerging |
| 1049 |
Yeonghun1675/L2M3
Large Language Models Material Miner |
|
Emerging |
| 1050 |
Pyenb/Ollama-models
A collection of zipped Ollama models for offline use. Simply download,... |
|
Emerging |
| 1051 |
Haiyang-W/TokenFormer
[ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking... |
|
Emerging |
| 1052 |
microsoft/batch-inference
Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT... |
|
Emerging |
| 1053 |
datawhalechina/llm-cookbook
Introductory LLM tutorial for developers; Chinese edition of Andrew Ng's large-model course series |
|
Emerging |
| 1054 |
scientific-discovery/LLEMA
[ICLR 2026] LLEMA: Evolutionary Search with LLMs for Multi-Objective... |
|
Emerging |
| 1055 |
NVlabs/DoRA
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed... |
|
Emerging |
| 1056 |
kohjingyu/gill
🐟 Code and models for the NeurIPS 2023 paper "Generating Images with... |
|
Emerging |
| 1057 |
datawhalechina/diy-llm
🎓 Systematic course on building large language models; 🛠️ covers pre-training data engineering, Tokenizer, Transformer, MoE, GPU programming... |
|
Emerging |
| 1058 |
antoyang/FrozenBiLM
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional... |
|
Emerging |
| 1059 |
RobertCsordas/modules
The official repository for our paper "Are Neural Nets Modular? Inspecting... |
|
Emerging |
| 1060 |
kohjingyu/fromage
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to... |
|
Emerging |
| 1061 |
PureBee/purebee
A GPU defined in software. Runs Llama 3.2 1B at 3.6 tok/sec. Zero dependencies. |
|
Emerging |
| 1062 |
dashstander/block-recurrent-transformer
PyTorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag... |
|
Emerging |
| 1063 |
huggingface/llm_training_handbook
An open collection of methodologies to help with successful training of... |
|
Emerging |
| 1064 |
TencentARC/LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion. |
|
Emerging |
| 1065 |
toyaix/TritonLLM
LLM Inference via Triton (Flexible & Modular): Focused on Kernel... |
|
Emerging |
| 1066 |
dataflowr/llm_efficiency
KV Cache & LoRA for minGPT |
|
Emerging |
| 1067 |
princeton-nlp/LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning |
|
Emerging |
| 1068 |
NVIDIA/logits-processor-zoo
A collection of LogitsProcessors to customize and enhance LLM behavior for... |
|
Emerging |
| 1069 |
ChaitanyaK77/Building-a-Small-Language-Model-SLM-
This repository provides a Jupyter Notebook for building a small language... |
|
Emerging |
| 1070 |
soda-inria/carte
Repository for CARTE: Context-Aware Representation of Table Entries |
|
Emerging |
| 1071 |
somosnlp/nlp-de-cero-a-cien
Hands-on course: NLP de cero a cien (NLP from zero to one hundred) 🤗 |
|
Emerging |
| 1072 |
zjysteven/mink-plus-plus
[ICLR'25 Spotlight] Min-K%++: Improved baseline for detecting pre-training... |
|
Emerging |
| 1073 |
Beomi/Gemma-EasyLM
Train GEMMA on TPU/GPU! (Codebase for training Gemma-Ko Series) |
|
Emerging |
| 1074 |
HamedBabaei/LLMs4OL
LLMs4OL: Large Language Models for Ontology Learning |
|
Emerging |
| 1075 |
princeton-nlp/CharXiv
[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in... |
|
Emerging |
| 1076 |
GT4SD/zero-shot-bert-adapters
Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection. |
|
Emerging |
| 1077 |
mlpc-ucsd/BLIVA
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich... |
|
Emerging |
| 1078 |
shikiw/OPERA
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large... |
|
Emerging |
| 1079 |
fangpin/llm-from-scratch
Build LLM from scratch |
|
Emerging |
| 1080 |
Liuhong99/Sophia
The official implementation of “Sophia: A Scalable Stochastic Second-order... |
|
Emerging |
| 1081 |
sandy1990418/Finetune-Qwen2.5-VL
Fine-tuning Qwen2.5-VL for vision-language tasks | Optimized for Vision... |
|
Emerging |
| 1082 |
KasperGroesLudvigsen/influenza_transformer
PyTorch implementation of Transformer model used in "Deep Transformer Models... |
|
Emerging |
| 1083 |
WindyLab/ConsensusLLM-code
Source code of our paper "Multi-Agent Consensus Seeking via Large Language Models". |
|
Emerging |
| 1084 |
Beomi/KcELECTRA
🤗 Korean Comments ELECTRA: an ELECTRA model trained on Korean comments |
|
Emerging |
| 1085 |
audioku/meta-transfer-learning
Implementation of meta-transfer-learning for ASR and LM (ACL 2020) |
|
Emerging |
| 1086 |
VectorInstitute/vectorlm
LLM finetuning in resource-constrained environments. |
|
Emerging |
| 1087 |
flixpar/med-ts-llm
MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis |
|
Emerging |
| 1088 |
php-llm/llm-chain
PHP library for building LLM-based and AI-based features and applications. |
|
Emerging |
| 1089 |
Guitaricet/relora
Official code for ReLoRA from the paper Stack More Layers Differently:... |
|
Emerging |
| 1090 |
bhavsarpratik/easy-transformers
Utility functions to work with transformers |
|
Emerging |
| 1091 |
gjbex/Deploying-LLMs-locally
Material for a training on AI tools |
|
Emerging |
| 1092 |
BodhiSearch/BodhiApp
Run Open Source/Open Weight LLMs locally with OpenAI compatible APIs |
|
Emerging |
| 1093 |
okuvshynov/slowllama
Finetune llama2-70b and codellama on MacBook Air without quantization |
|
Emerging |
| 1094 |
Eamon2009/Transformer-language-model
An educational implementation of a GPT-style language model built from... |
|
Emerging |
| 1095 |
tigerchen52/query_level_uncertainty
query-level uncertainty in LLMs |
|
Emerging |
| 1096 |
chensyCN/llm4ea_official
[NeurIPS'24] LLM4EA: Entity Alignment with Noisy Annotations from Large... |
|
Emerging |
| 1097 |
xiuqhou/Salience-DETR
[CVPR 2024] Official implementation of the paper "Salience DETR: Enhancing... |
|
Emerging |
| 1098 |
fardjad/node-llmatic
Use self-hosted LLMs with an OpenAI compatible API |
|
Emerging |
| 1099 |
piresramon/gpt-4-enem
Code and data to evaluate LLMs on the ENEM, the main standardized Brazilian... |
|
Emerging |
| 1100 |
KennethEnevoldsen/spacy-wrap
spaCy-wrap is a wrapper library for spaCy for including fine-tuned... |
|
Emerging |