All Transformer Models

6,429 models ranked by quality score

Showing 1001–1100 of 6,429
# Model Score Tier
1001 xmindflow/Awesome-Transformer-in-Medical-Imaging

[MedIA Journal] An ultimately comprehensive paper list of Vision...

42
Emerging
1002 NetEase-Media/grps_trtllm

Higher performance OpenAI LLM service than vLLM serve: A pure C++...

42
Emerging
1003 mmaaz60/EdgeNeXt

[CADL'22, ECCVW] Official repository of paper titled "EdgeNeXt: Efficiently...

42
Emerging
1004 liuqidong07/LLM-ESR

[NeurIPS'24 Spotlight] The official implementation code of LLM-ESR.

42
Emerging
1005 GiovanniGatti/socratic-llm

Training pipeline for fine-tuning Phi-3-mini-instruct to follow the Socratic method

42
Emerging
1006 SensAI-PT/LLaMa2lang

Convenience scripts to finetune (chat-)LLaMa3 and other models for any language

42
Emerging
1007 FreeOCR-AI/layoutreader

A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.

42
Emerging
1008 declare-lab/flan-alpaca

This repository contains code for extending the Stanford Alpaca synthetic...

42
Emerging
1009 HKUDS/OpenGraph

[EMNLP'2024] "OpenGraph: Towards Open Graph Foundation Models"

42
Emerging
1010 txsun1997/Black-Box-Tuning

ICML'2022: Black-Box Tuning for Language-Model-as-a-Service & EMNLP'2022:...

42
Emerging
1011 phronmophobic/llama.clj

Run LLMs locally. A clojure wrapper for llama.cpp.

42
Emerging
1012 nv-tlabs/LLaMA-Mesh

Unifying 3D Mesh Generation with Language Models

42
Emerging
1013 tatsu-lab/alpaca_farm

A simulation framework for RLHF and alternatives. Develop your RLHF method...

42
Emerging
1014 Nuked88/ComfyUI-N-Nodes

A suite of custom nodes for ComfyUI that includes GPT text-prompt...

42
Emerging
1015 luchangli03/export_llama_to_onnx

export llama to onnx

42
Emerging
1016 jlin816/dynalang

Code for "Learning to Model the World with Language." ICML 2024 Oral.

42
Emerging
1017 Rishit-dagli/Conformer

An implementation of Conformer: Convolution-augmented Transformer for Speech...

42
Emerging
1018 sangmichaelxie/doremi

Pytorch implementation of DoReMi, a method for optimizing the data mixture...

42
Emerging
1019 taufeeque9/codebook-features

Sparse and discrete interpretability tool for neural networks

42
Emerging
1020 EvilFreelancer/impruver

A set of scripts and configurations for pretraining of Large Language Models (LLM)

42
Emerging
1021 thuml/Flowformer

Code release for "Flowformer: Linearizing Transformers with...

42
Emerging
1022 sagorbrur/bntransformer

Bengali transformer using transformers

41
Emerging
1023 emapco/rk-transformers

Export and Run Hugging Face Transformers Models on Rockchip NPUs

41
Emerging
1024 FoundationVision/Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual...

41
Emerging
1025 godatadriven/rhyme-with-ai

Rhyme with AI

41
Emerging
1026 aihao2000/DPN-LLaVA

Arxiv 25: Dynamic Pyramid Network for Efficient Multimodal Large Language Model

41
Emerging
1027 wgcban/HyperTransformer

[CVPR'22] HyperTransformer: A Textural and Spectral Feature Fusion...

41
Emerging
1028 JohnMachado11/Build-a-Large-Language-Model-from-Scratch

Building a GPT-like LLM from scratch with PyTorch.

41
Emerging
1029 octanove/shiba

Pytorch implementation and pre-trained Japanese model for CANINE, the...

41
Emerging
1030 pratyushasharma/laser

The Truth Is In There: Improving Reasoning in Language Models with...

41
Emerging
1031 datadreamer-dev/DataDreamer

DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤

41
Emerging
1032 Ethan-yt/guwenbert

GuwenBERT: A Pre-trained Language Model for Classical...

41
Emerging
1033 erogol/BlaGPT

Experimental playground for benchmarking language model (LM) architectures,...

41
Emerging
1034 skyloevil/llm-scratch-pytorch

llm-scratch-pytorch - The code is designed to be beginner-friendly, with a...

41
Emerging
1035 The-FinAI/CALM

An LLM training and evaluation benchmark for credit scoring

41
Emerging
1036 robinniesert/kaggle-google-quest

Google QUEST Q&A Labeling Kaggle Competition 6th Place Solution

41
Emerging
1037 jshilong/GPT4RoI

(ECCVW 2025) GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

41
Emerging
1038 ymcui/PERT

PERT: Pre-training BERT with Permuted Language Model

41
Emerging
1039 MIV-XJTU/JanusVLN

[ICLR2026] Official implementation for "JanusVLN: Decoupling Semantics and...

41
Emerging
1040 varunshenoy/super-json-mode

Low latency JSON generation using LLMs ⚡️

41
Emerging
1041 OpenGVLab/VisionLLM

VisionLLM Series

41
Emerging
1042 Eclipsess/Awesome-Efficient-Reasoning-LLMs

[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large...

41
Emerging
1043 airaria/Visual-Chinese-LLaMA-Alpaca

Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA)

41
Emerging
1044 KolosalAI/Kolosal

Kolosal AI is an OpenSource and Lightweight alternative to LM Studio to run...

41
Emerging
1045 liangyuwang/zo2

ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with...

41
Emerging
1046 adaptivetokensampling/ATS

Adaptive Token Sampling for Efficient Vision Transformers (ECCV 2022 Oral...

41
Emerging
1047 poloclub/LLM-Attributor

LLM Attributor: Attribute LLM's Generated Text to Training Data

41
Emerging
1048 TIGER-AI-Lab/QuickVideo

Quick Long Video Understanding [TMLR2025]

41
Emerging
1049 Yeonghun1675/L2M3

Large Language Models Material Miner

41
Emerging
1050 Pyenb/Ollama-models

A collection of zipped Ollama models for offline use. Simply download,...

41
Emerging
1051 Haiyang-W/TokenFormer

[ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking...

41
Emerging
1052 microsoft/batch-inference

Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT...

41
Emerging
1053 datawhalechina/llm-cookbook

An introductory LLM tutorial for developers: the Chinese edition of Andrew Ng's large-model course series

41
Emerging
1054 scientific-discovery/LLEMA

[ICLR 2026] LLEMA: Evolutionary Search with LLMs for Multi-Objective...

41
Emerging
1055 NVlabs/DoRA

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed...

41
Emerging
1056 kohjingyu/gill

🐟 Code and models for the NeurIPS 2023 paper "Generating Images with...

41
Emerging
1057 datawhalechina/diy-llm

🎓 A systematic course on building large language models | 🛠️ Covers pre-training data engineering, Tokenizer, Transformer, MoE, GPU programming...

41
Emerging
1058 antoyang/FrozenBiLM

[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional...

41
Emerging
1059 RobertCsordas/modules

The official repository for our paper "Are Neural Nets Modular? Inspecting...

41
Emerging
1060 kohjingyu/fromage

🧀 Code and models for the ICML 2023 paper "Grounding Language Models to...

41
Emerging
1061 PureBee/purebee

A GPU defined in software. Runs Llama 3.2 1B at 3.6 tok/sec. Zero dependencies.

41
Emerging
1062 dashstander/block-recurrent-transformer

Pytorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag...

41
Emerging
1063 huggingface/llm_training_handbook

An open collection of methodologies to help with successful training of...

41
Emerging
1064 TencentARC/LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.

41
Emerging
1065 toyaix/TritonLLM

LLM Inference via Triton (Flexible & Modular): Focused on Kernel...

41
Emerging
1066 dataflowr/llm_efficiency

KV Cache & LoRA for minGPT

41
Emerging
1067 princeton-nlp/LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

41
Emerging
1068 NVIDIA/logits-processor-zoo

A collection of LogitsProcessors to customize and enhance LLM behavior for...

41
Emerging
1069 ChaitanyaK77/Building-a-Small-Language-Model-SLM-

This Repository provides a Jupyter Notebook for building a small language...

41
Emerging
1070 soda-inria/carte

Repository for CARTE: Context-Aware Representation of Table Entries

41
Emerging
1071 somosnlp/nlp-de-cero-a-cien

Practical course: NLP from zero to one hundred 🤗

41
Emerging
1072 zjysteven/mink-plus-plus

[ICLR'25 Spotlight] Min-K%++: Improved baseline for detecting pre-training...

41
Emerging
1073 Beomi/Gemma-EasyLM

Train GEMMA on TPU/GPU! (Codebase for training Gemma-Ko Series)

41
Emerging
1074 HamedBabaei/LLMs4OL

LLMs4OL: Large Language Models for Ontology Learning

41
Emerging
1075 princeton-nlp/CharXiv

[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in...

41
Emerging
1076 GT4SD/zero-shot-bert-adapters

Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.

41
Emerging
1077 mlpc-ucsd/BLIVA

(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich...

41
Emerging
1078 shikiw/OPERA

[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large...

41
Emerging
1079 fangpin/llm-from-scratch

Build LLM from scratch

41
Emerging
1080 Liuhong99/Sophia

The official implementation of “Sophia: A Scalable Stochastic Second-order...

41
Emerging
1081 sandy1990418/Finetune-Qwen2.5-VL

Fine-tuning Qwen2.5-VL for vision-language tasks | Optimized for Vision...

41
Emerging
1082 KasperGroesLudvigsen/influenza_transformer

PyTorch implementation of Transformer model used in "Deep Transformer Models...

41
Emerging
1083 WindyLab/ConsensusLLM-code

Source code of our paper "Multi-Agent Consensus Seeking via Large Language Models".

41
Emerging
1084 Beomi/KcELECTRA

🤗 Korean Comments ELECTRA: an ELECTRA model trained on Korean comments

41
Emerging
1085 audioku/meta-transfer-learning

Implementation of meta-transfer-learning for ASR and LM (ACL 2020)

41
Emerging
1086 VectorInstitute/vectorlm

LLM finetuning in resource-constrained environments.

41
Emerging
1087 flixpar/med-ts-llm

MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis

41
Emerging
1088 php-llm/llm-chain

PHP library for building LLM-based and AI-based features and applications.

41
Emerging
1089 Guitaricet/relora

Official code for ReLoRA from the paper Stack More Layers Differently:...

41
Emerging
1090 bhavsarpratik/easy-transformers

Utility functions to work with transformers

41
Emerging
1091 gjbex/Deploying-LLMs-locally

Material for a training on AI tools

41
Emerging
1092 BodhiSearch/BodhiApp

Run Open Source/Open Weight LLMs locally with OpenAI compatible APIs

41
Emerging
1093 okuvshynov/slowllama

Finetune llama2-70b and codellama on MacBook Air without quantization

41
Emerging
1094 Eamon2009/Transformer-language-model

An educational implementation of a GPT-style language model built from...

41
Emerging
1095 tigerchen52/query_level_uncertainty

query-level uncertainty in LLMs

41
Emerging
1096 chensyCN/llm4ea_official

[NeurIPS'24] LLM4EA: Entity Alignment with Noisy Annotations from Large...

41
Emerging
1097 xiuqhou/Salience-DETR

[CVPR 2024] Official implementation of the paper "Salience DETR: Enhancing...

41
Emerging
1098 fardjad/node-llmatic

Use self-hosted LLMs with an OpenAI compatible API

41
Emerging
1099 piresramon/gpt-4-enem

Code and data to evaluate LLMs on the ENEM, the main standardized Brazilian...

41
Emerging
1100 KennethEnevoldsen/spacy-wrap

spaCy-wrap is a wrapper library for spaCy for including fine-tuned...

41
Emerging