All Transformer Models
6,429 models ranked by quality score · Page 19 of 65
| # | Model | Score | Tier |
|---|---|---|---|
| 1801 | linonetwo/langchain-alpaca<br>Run Alpaca LLM in LangChain | | Emerging |
| 1802 | ausboss/Local-LLM-Langchain<br>Load local LLMs effortlessly in a Jupyter notebook for testing purposes... | | Emerging |
| 1803 | nawnoes/pytorch-gpt-x<br>An implementation of an autoregressive language model using an improved... | | Emerging |
| 1804 | ksm26/Open-Source-Models-with-Hugging-Face<br>"Open Source Models with Hugging Face" course empowers you with the skills... | | Emerging |
| 1805 | nanowell/Differential-Transformer-PyTorch<br>PyTorch implementation of the Differential-Transformer architecture for... | | Emerging |
| 1806 | Hon-Wong/VoRA<br>[Fully open] [Encoder-free MLLM] Vision as LoRA | | Emerging |
| 1807 | VikingOwl91/vessel<br>A lightweight, local-first web UI for managing Ollama models. | | Emerging |
| 1808 | Gen-Verse/ReasonFlux<br>[NeurIPS 2025 Spotlight] LLM post-training suite, featuring ReasonFlux,... | | Emerging |
| 1809 | PKU-Alignment/beavertails<br>BeaverTails is a collection of datasets designed to facilitate research on... | | Emerging |
| 1810 | wang2226/Awesome-LLM-Decoding<br>📜 Paper list on decoding methods for LLMs and LVLMs | | Emerging |
| 1811 | dougeeai/llama-cpp-python-wheels<br>Pre-built wheels for llama-cpp-python across platforms and CUDA versions | | Emerging |
| 1812 | HaoAreYuDong/MachineLearningLM<br>Scaling In-context Learning from Few-shot to 1,024-shot on Tabular ML | | Emerging |
| 1813 | Pengxin-Guo/FedSA-LoRA<br>Selective Aggregation for Low-Rank Adaptation in Federated Learning [ICLR 2025] | | Emerging |
| 1814 | neosantara-xyz/glm-ocr-inference<br>Fast and lightweight GLM-OCR inference on Modal with an OpenAI-compatible... | | Emerging |
| 1815 | asigalov61/Giant-Music-Transformer<br>[SOTA] [92% acc] 786M-8k-44L-32H multi-instrumental music transformer with... | | Emerging |
| 1816 | mtuann/llm-updated-papers<br>Papers related to Large Language Models in all top venues | | Emerging |
| 1817 | Active-Matrix/proximity<br>Proximity is an AI-powered news aggregator and TL;DR summarizer with a... | | Emerging |
| 1818 | kkahatapitiya/LangRepo<br>Code for our ACL 2025 paper "Language Repository for Long Video Understanding" | | Emerging |
| 1819 | gitctrlx/llama.cu<br>Llama from scratch in CUDA with Flash Attention. | | Emerging |
| 1820 | pymc-labs/transpailer<br>LLM-based, self-correcting transpiler. Supports JAX, PyTorch, Rust, PyMC, Stan. | | Emerging |
| 1821 | VectorInstitute/atomgen<br>Library for handling atomistic graph datasets focusing on transformer-based... | | Emerging |
| 1822 | liaoyuhua/LLM4TS<br>Large Language & Foundation Models for Time Series. | | Emerging |
| 1823 | liuqidong07/LEADER-pytorch<br>[arXiv'24] The official implementation code of LEADER. | | Emerging |
| 1824 | DunnBC22/Vision_Audio_and_Multimodal_Projects<br>This repository includes all computer vision, audio, document AI, and... | | Emerging |
| 1825 | antoninodimaggio/Hugging-Captions<br>Generate realistic Instagram captions using transformers 🤗 | | Emerging |
| 1826 | louisbrulenaudet/tsdae<br>Transformer-based Denoising AutoEncoder for Sentence Transformers... | | Emerging |
| 1827 | hesamsheikh/llm-mechanics<br>Coding an LLM and its building blocks from scratch. | | Emerging |
| 1828 | chelsea0x3b/llama-dfdx<br>LLaMa 7b with CUDA acceleration implemented in rust. Minimal GPU memory needed! | | Emerging |
| 1829 | rezazad68/transdeeplab<br>TransDeepLab: Convolution-Free Transformer-based DeepLab v3+ for Medical... | | Emerging |
| 1830 | czg1225/CoDe<br>[CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive... | | Emerging |
| 1831 | WayneMao/RoboMatrix<br>The Official Implementation of RoboMatrix | | Emerging |
| 1832 | IDSIA/modern-srwm<br>Official repository for the paper "A Modern Self-Referential Weight Matrix... | | Emerging |
| 1833 | StargazerX0/ScaleKV<br>[NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with... | | Emerging |
| 1834 | ModelTC/QLLM<br>[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate... | | Emerging |
| 1835 | Beomi/BitNet-Transformers<br>0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of... | | Emerging |
| 1836 | SqueezeAILab/KVQuant<br>[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with... | | Emerging |
| 1837 | mechramc/Orion<br>Local AI runtime for training & running small LLMs directly on Apple Neural... | | Emerging |
| 1838 | CEC-Agent/CEC<br>Official Implementation of NeurIPS'23 Paper "Cross-Episodic Curriculum for... | | Emerging |
| 1839 | theodo-group/GenossGPT<br>One API for all LLMs either Private or Public (Anthropic, Llama V2, GPT... | | Emerging |
| 1840 | SakanaAI/evo-memory<br>Code to train and evaluate Neural Attention Memory Models to obtain... | | Emerging |
| 1841 | AkiRusProd/numpy-transformer<br>A numpy implementation of the Transformer model in "Attention is All You Need" | | Emerging |
| 1842 | iKernels/transformers-lightning<br>A collection of Models, Datasets, DataModules, Callbacks, Metrics, Losses... | | Emerging |
| 1843 | ARUNAGIRINATHAN-K/huggingface-on-aws<br>Deploy and train Hugging Face models on AWS: SageMaker, Bedrock, ECS, EKS, and more. | | Emerging |
| 1844 | will-thompson-k/tldr-transformers<br>The "tl;dr" on a few notable transformer papers (pre-2022). | | Emerging |
| 1845 | dsindex/iclassifier<br>reference pytorch code for intent classification | | Emerging |
| 1846 | holarissun/RewardModelingBeyondBradleyTerry<br>official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models... | | Emerging |
| 1847 | Nikityyy/lille<br>A powerful 130-million-parameter model trained from scratch as part of a... | | Emerging |
| 1848 | nlpkeg/Know-MRI<br>This is an official code for the [ACL 2025 Demo] paper: Know-MRI: A... | | Emerging |
| 1849 | cxcscmu/Montessori-Instruct<br>Official repository for Montessori-Instruct: Generate Influential Training... | | Emerging |
| 1850 | Infini-AI-Lab/Sequoia<br>scalable and robust tree-based speculative decoding algorithm | | Emerging |
| 1851 | VITA-Group/Ms-PoE<br>"Found in the Middle: How Language Models Use Long Contexts Better via... | | Emerging |
| 1852 | sunnynguyen-ai/llm-attention-visualizer<br>Interactive tool for analyzing attention patterns in transformer models with... | | Emerging |
| 1853 | abacaj/transformers-docker<br>Run, build, test transformer models using docker | | Emerging |
| 1854 | MNoorFawi/curlora<br>The code repository for the CURLoRA research paper. Stable LLM continual... | | Emerging |
| 1855 | ZJLAB-AMMI/LLM4Teach<br>Python code to implement LLM4Teach, a policy distillation approach for... | | Emerging |
| 1856 | jesus3476/Fire-Detection-Siglip2<br>Fire-Detection-Siglip2 is an image classification vision-language encoder... | | Emerging |
| 1857 | haoliuhl/instructrl<br>Instruction Following Agents with Multimodal Transformers | | Emerging |
| 1858 | monk1337/auto-ollama<br>run ollama & gguf easily with a single command | | Emerging |
| 1859 | AntonioGr7/pratical-llms<br>A collection of hands-on notebooks for LLM practitioners | | Emerging |
| 1860 | pymc-labs/transalchemy<br>LLM-based, self-correcting transpiler. Supports JAX, PyTorch, Rust, PyMC, Stan. | | Emerging |
| 1861 | starmpcc/CAMEL<br>Clinically Adapted Model Enhanced from LLaMA | | Emerging |
| 1862 | RahulSChand/gpu_poor<br>Calculate token/s & GPU memory requirement for any LLM. Supports... | | Emerging |
| 1863 | snktshrma/ngps_flight<br>Global vision positioning system for UAVs in outdoor GNSS-denied environments | | Emerging |
| 1864 | yang-ai-lab/OSF-Open-Sleep-FM<br>OSF: On Pre-training and Scaling of Sleep Foundation Models | | Emerging |
| 1865 | IvanBongiorni/maximal<br>A TensorFlow-compatible Python library that provides models and layers to... | | Emerging |
| 1866 | Tebmer/Awesome-Knowledge-Distillation-of-LLMs<br>This repository collects papers for "A Survey on Knowledge Distillation of... | | Emerging |
| 1867 | OneInterface/realtime-bakllava<br>llama.cpp with the BakLLaVA model describes what it sees | | Emerging |
| 1868 | moritztng/fltr<br>Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B. | | Emerging |
| 1869 | fermyon/ai-examples<br>A collection of serverless apps that show how Fermyon's Serverless AI... | | Emerging |
| 1870 | hpdps-group/ElasticMM<br>ElasticMM: Elastic and Efficient MLLM Serving System | | Emerging |
| 1871 | ASSERT-KTH/repairllama<br>RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program... | | Emerging |
| 1872 | 5aharsh/collama<br>Run Ollama LLM models in Google Colab for free | | Emerging |
| 1873 | liupras/Practical-local-LLM-programming<br>Programming with local large language model. | | Emerging |
| 1874 | haesleinhuepf/human-eval-bia<br>Benchmarking Large Language Models for Bio-Image Analysis Code Generation | | Emerging |
| 1875 | david0154/david-ai<br>🤖 D.A.V.I.D AI - Advanced AI assistant with voice control, gesture... | | Emerging |
| 1876 | zerovl/ZeroVL<br>[ECCV2022] Contrastive Vision-Language Pre-training with Limited Resources | | Emerging |
| 1877 | CLAIRE-Labo/quantile-reward-policy-optimization<br>Official codebase for "Quantile Reward Policy Optimization: Alignment with... | | Emerging |
| 1878 | eliahuhorwitz/Spectral-DeTuning<br>Official PyTorch Implementation for the "Recovering the Pre-Fine-Tuning... | | Emerging |
| 1879 | rasbt/gradient-accumulation-blog<br>Finetuning BLOOM on a single GPU using gradient-accumulation | | Emerging |
| 1880 | ictnlp/TruthX<br>Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large... | | Emerging |
| 1881 | eugenehp/bitnet-cpp-rs<br>Rust bindings for bitnet.cpp based on llama-cpp-4 | | Emerging |
| 1882 | LucknowAI/Lucknow-LLM<br>Collecting data for Building Lucknow's first LLM | | Emerging |
| 1883 | steinbergmedia/libmusictok<br>C++ Library for tokenizing MIDI files, designed to be compatible with the... | | Emerging |
| 1884 | davide-coccomini/MINTIME-Multi-Identity-size-iNvariant-TIMEsformer-for-Video-Deepfake-Detection<br>Code for Video Deepfake Detector from "MINTIME: Multi-Identity... | | Emerging |
| 1885 | kyegomez/CNNGPT<br>This CNN-based language model leverages causal and dilated convolutions,... | | Emerging |
| 1886 | henrikalbihn/gliner-as-a-service<br>GLiNER model in a FastAPI microservice. | | Emerging |
| 1887 | fboulnois/llm-leaderboard-csv<br>CSVs of the Huggingface and LMArena LLM leaderboards, along with the code to... | | Emerging |
| 1888 | TencentARC/ST-LLM<br>[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language... | | Emerging |
| 1889 | zhenyi4/ssa<br>Official repository for "SSA: Sparse Sparse Attention by Aligning Full and... | | Emerging |
| 1890 | ykjaat6104/LLM-Cost-and-Token-Efficiency-Analysis<br>A benchmark study analyzing cost and token efficiency across 14 LLMs from 5... | | Emerging |
| 1891 | TrevTron/indiedroid-nova-llm<br>Running Llama 3.1 8B and other LLMs on RK3588 NPU - benchmarks and setup guides | | Emerging |
| 1892 | thruthseeker/LionLock_FDE_OSS<br>Open source fatigue detection engine for large language models with trust overlay | | Emerging |
| 1893 | RLHFlow/Online-RLHF<br>A recipe for online RLHF and online iterative DPO. | | Emerging |
| 1894 | kyegomez/MGQA<br>The open source implementation of the multi grouped query attention by the... | | Emerging |
| 1895 | itsqyh/Awesome-LMMs-Mechanistic-Interpretability<br>A curated collection of resources focused on the Mechanistic... | | Emerging |
| 1896 | ashioyajotham/fingpt_trader<br>A quant trading system platform based on FinGPT, demonstrating new... | | Emerging |
| 1897 | tsinghua-fib-lab/ANeurIPS2024_SPV-MIA<br>[NeurIPS'24] "Membership Inference Attacks against Fine-tuned Large Language... | | Emerging |
| 1898 | InternLM/OREAL<br>Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning | | Emerging |
| 1899 | TIGER-AI-Lab/General-Reasoner<br>General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] | | Emerging |
| 1900 | jhcho99/GSRTR<br>[BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition... | | Emerging |