All Transformer Models

6,429 models ranked by quality score · Page 13 of 65

Showing 1201–1300 of 6,429
# Model Score Tier
1201 iverly/llamafile-docker

Distribute and run llamafile/LLMs with a single docker image.

40
Emerging
1202 xjywhu/Awesome-Multimodal-LLM-for-Code

Multimodal Large Language Models for Code Generation under Multimodal Scenarios

40
Emerging
1203 ariya/chat-llm

Chat with an LLM

40
Emerging
1204 jianzhnie/awesome-instruction-datasets

A collection of awesome-prompt-datasets, awesome-instruction-dataset, to...

40
Emerging
1205 awaescher/llmaid

Mass-edit files with LLMs

40
Emerging
1206 swordlidev/Efficient-Multimodal-LLMs-Survey

Efficient Multimodal Large Language Models: A Survey

40
Emerging
1207 kyegomez/MHMoE

Community Implementation of the paper: "Multi-Head Mixture-of-Experts" In PyTorch

40
Emerging
1208 YadaYuki/transformer-from-scratch

Transformer from scratch 🙊 (English to Japanese Translator by PyTorch)

40
Emerging
1209 yuanzhoulvpi2017/DocumentSearch

A document search tool built on sentence transformers and ChatGLM

40
Emerging
1210 tpoisonooo/llama.onnx

LLaMa/RWKV onnx models, quantization and testcase

40
Emerging
1211 dingo-actual/infini-transformer

PyTorch implementation of Infini-Transformer from "Leave No Context Behind:...

40
Emerging
1212 KolosalAI/kolosal-cli

Super lightweight Ollama + Qwen Code alternative to run Llama 3.3,...

40
Emerging
1213 TIGER-AI-Lab/Pixel-Reasoner

Pixel-Level Reasoning Model trained with RL [NeurIPS25]

40
Emerging
1214 skit-ai/SpeechLLM

This repository contains the training, inference, evaluation code for...

40
Emerging
1215 neulab/knn-transformers

PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling...

40
Emerging
1216 monologg/KoBigBird

🦅 Pretrained BigBird Model for Korean (up to 4096 tokens)

40
Emerging
1217 ymcui/Chinese-Mixtral

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)

40
Emerging
1218 QwenLM/Qwen2.5-Math

A series of math-specific large language models of our Qwen2 series.

40
Emerging
1219 GAIR-NLP/ProX

[ICML 2025] Programming Every Example: Lifting Pre-training Data Quality...

40
Emerging
1220 ariannamethod/nanollama

Train Llama 3 models from scratch. Any scale, any personality. By Arianna Method.

40
Emerging
1221 JackZeng0208/llama.cpp-android-tutorial

llama.cpp tutorial on Android phone

40
Emerging
1222 FoundationVision/UniTok

[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding

40
Emerging
1223 zeyadusf/LLMs-from-Scratch

Build a Large Language Model (From Scratch) book and Finetuned Models

40
Emerging
1224 AllenXiangX/SnowflakeNet

(TPAMI 2023) Snowflake Point Deconvolution for Point Cloud Completion and...

40
Emerging
1225 VPGTrans/VPGTrans

Codes for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA,...

40
Emerging
1226 ShiZhengyan/DePT

[ICLR 2024] This is the repository for the paper titled "DePT: Decomposed...

40
Emerging
1227 amazon-science/unified-ept

A Unified Efficient Pyramid Transformer for Semantic Segmentation, ICCVW 2021

40
Emerging
1228 torchspec-project/TorchSpec

A PyTorch native library for training speculative decoding models

40
Emerging
1229 google-research/magvit

Official JAX implementation of MAGVIT: Masked Generative Video Transformer

40
Emerging
1230 stanleylsx/llms_tool

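A tool for training and testing large language models, built on HuggingFace. Supports web UI and terminal inference for each model, plus low-parameter and full-parameter training (pre-training, SFT, RM, PPO, D...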

40
Emerging
1231 nlpodyssey/cybertron

Cybertron: the home planet of the Transformers in Go

40
Emerging
1232 donaldafeith/Pytorch_Merge

Merge LLMs that are split into parts

40
Emerging
1233 OpenLemur/Lemur

[ICLR 2024] Lemur: Open Foundation Models for Language Agents

40
Emerging
1234 YuweiYin/FinPT

FinPT: Financial Risk Prediction with Profile Tuning on Pretrained Foundation Models

40
Emerging
1235 MetaGLM/FinGLM

FinGLM: dedicated to building an open, public-interest, long-lasting financial large-model project that uses open source to advance "AI + Finance".

40
Emerging
1236 prajjwal1/fluence

A deep learning library based on Pytorch focussed on low resource language...

40
Emerging
1237 geobrain-ai/geogalactica

Code and datasets for paper "GeoGalactica: A Scientific Large Language Model...

40
Emerging
1238 rafiepour/CTran

Complete code for the proposed CNN-Transformer model for natural language...

39
Emerging
1239 Lupin1998/Awesome-MIM

[Survey] Masked Modeling for Self-supervised Representation Learning on...

39
Emerging
1240 nicola-decao/KnowledgeEditor

Code for Editing Factual Knowledge in Language Models

39
Emerging
1241 HKUDS/GraphEdit

"GraphEdit: Large Language Models for Graph Structure Learning"

39
Emerging
1242 LlamaFamily/Llama-Chinese

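Llama Chinese community: a continuously updated collection of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use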

39
Emerging
1243 declare-lab/red-instruct

Codes and datasets of the paper Red-Teaming Large Language Models using...

39
Emerging
1244 Sea-Snell/CALM-Dialogue

Official code for the paper "Context-Aware Language Modeling for...

39
Emerging
1245 James-QiuHaoran/LLM-serving-with-proxy-models

Efficient Interactive LLM Serving with Proxy Model-based Sequence Length...

39
Emerging
1246 yifanzhang-pro/AutoMathText

[ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative...

39
Emerging
1247 ylsung/VL_adapter

PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for...

39
Emerging
1248 FareedKhan-dev/create-million-parameter-llm-from-scratch

Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.

39
Emerging
1249 horseee/Awesome-Efficient-LLM

A curated list for Efficient Large Language Models

39
Emerging
1250 soldni/pyterrier_sentence_transformers

Create PyTerrier compatible dense indices using any sentence_transformers model

39
Emerging
1251 Arkapravo-Ghosh/speech-to-text

Speech to Text Transcription using OpenAI Whisper v3 and FastAPI

39
Emerging
1252 microsoft/COCO-LM

[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for...

39
Emerging
1253 Sea-Snell/JAXSeq

Train very large language models in Jax.

39
Emerging
1254 virevolai/logos-shift-client

Replace expensive LLM calls with finetunes automatically

39
Emerging
1255 Geotrend-research/smaller-transformers

Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0.

39
Emerging
1256 hpcaitech/SwiftInfer

Efficient AI Inference & Serving

39
Emerging
1257 NimbleEdge/sparse_transformers

Sparse Inferencing for transformer based LLMs

39
Emerging
1258 MrYxJ/calculate-flops.pytorch

calflops is designed to calculate FLOPs, MACs, and parameters in all...

39
Emerging
1259 clovaai/length-adaptive-transformer

Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021)

39
Emerging
1260 Mj23978/sam-assistant

🤖 Sam-assistant is a personal assistant that is designed to understand your...

39
Emerging
1261 cloudmercato/ollama-benchmark

Handy tool to measure the performance and efficiency of LLMs workloads.

39
Emerging
1262 NVlabs/Long-RL

Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)

39
Emerging
1263 zalkikar/mlm-bias

Measuring Biases in Masked Language Models for PyTorch Transformers. Support...

39
Emerging
1264 TIGER-AI-Lab/Vamba

Code for the paper "Vamba: Understanding Hour-Long Videos with Hybrid...

39
Emerging
1265 FareedKhan-dev/qwen3-MoE-from-scratch

A Step-by-Step Implementation of Qwen 3 MoE Architecture from Scratch

39
Emerging
1266 wellcometrust/WellcomeML

Retired repository for Machine Learning utils at the Wellcome Trust (now deprecated).

39
Emerging
1267 kssteven418/BigLittleDecoder

[NeurIPS'23] Speculative Decoding with Big Little Decoder

39
Emerging
1268 robertvacareanu/llm4regression

Examining how large language models (LLMs) perform across various synthetic...

39
Emerging
1269 kyegomez/TeraGPT

Train a production grade GPT in less than 400 lines of code. Better than...

39
Emerging
1270 flowersteam/LLM-Culture

Code for the "Cultural evolution in populations of Large Language Models" paper

39
Emerging
1271 18907305772/FuseAI

FuseAI Project

39
Emerging
1272 sixfingerdev/sixfinger-api

SixFinger API - Free AI Chat Api - 10-20x Faster AI Chat API - Including 10 models.

39
Emerging
1273 knotgrass/attention

several types of attention modules written in PyTorch for learning purposes

39
Emerging
1274 akshitac8/OW-DETR

[CVPR 2022] Official Pytorch code for OW-DETR: Open-world Detection Transformer

39
Emerging
1275 wpeebles/G.pt

Official PyTorch Implementation of "Learning to Learn with Generative Models...

39
Emerging
1276 datawhalechina/llm-deploy

Theory and practice of large-model/LLM inference and deployment

39
Emerging
1277 rishub-tamirisa/tamper-resistance

[ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for...

39
Emerging
1278 zhenyi4/codi

Official repository for "CODI: Compressing Chain-of-Thought into Continuous...

39
Emerging
1279 xiuqhou/Relation-DETR

[ECCV2024 Oral] Official implementation of the paper "Relation DETR:...

39
Emerging
1280 hitz-zentroa/GoLLIE

Guideline following Large Language Model for Information Extraction

39
Emerging
1281 lqzxt/Time-R1

Time-R1 is a two-stage reinforcement fine-tuning framework that trains large...

39
Emerging
1282 qizekun/ShapeLLM

[ECCV 2024] ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

39
Emerging
1283 gtausa197-svg/-Project-Nord-Spiking-Neural-Network-Language-Model

The first pure SNN language model trained from scratch with a fully original...

39
Emerging
1284 SolomonB14D3/knowledge-fidelity

Behavioral auditing & repair toolkit for LLMs. Measures 8 dimensions via...

39
Emerging
1285 kehanlu/DeSTA2

Code and model for ICASSP 2025 Paper "Developing Instruction-Following...

39
Emerging
1286 rasbt/dora-from-scratch

LoRA and DoRA from Scratch Implementations

39
Emerging
1287 ramonclaudio/perplexity-ai-toolkit

A lightweight Python API wrapper and CLI for Perplexity’s Sonar language models.

39
Emerging
1288 shm007g/LLaMA-Cult-and-More

Large Language Models for All, 🦙 Cult and More, Stay in touch !

39
Emerging
1289 EricLBuehler/xlora

X-LoRA: Mixture of LoRA Experts

39
Emerging
1290 sugarme/transformer

NLP transformers written in Go

39
Emerging
1291 pranavkumaarofficial/nlcli-wizard

Natural language control for Python CLI tools using locally-trained SLMs...

39
Emerging
1292 g8a9/ferret

A python package for benchmarking interpretability techniques on Transformers.

39
Emerging
1293 belladoreai/llama-tokenizer-js

JS tokenizer for LLaMA 1 and 2

39
Emerging
1294 pogzyb/tourist

Open-source, LLM-ready SERP and web scraping service

39
Emerging
1295 kastalimohammed1965/CLIP-fine-tune-registers-gated

Vision Transformers Need Registers. And Gated MLPs. And +20M params. Tiny...

39
Emerging
1296 GAIR-NLP/OctoThinker

Revisiting Mid-training in the Era of Reinforcement Learning Scaling

39
Emerging
1297 neuralwork/instruct-finetune-mistral

Fine-tune Mistral 7B to generate fashion style suggestions

39
Emerging
1298 czg1225/dParallel

[ICLR 2026] dParallel: Learnable Parallel Decoding for dLLMs

39
Emerging
1299 THUDM/LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

39
Emerging
1300 Esmail-ibraheem/nanograd

nanograd🧠 ML/DL and neural net ecosystem, run models like GPT, llama, stable...

39
Emerging