All Transformer Models
6,429 models ranked by quality score · Page 18 of 65
| # | Model | Score | Tier |
|---|---|---|---|
| 1701 | ShelbyJenkins/llm_utils · llm_utils: Basic LLM tools, best practices, and minimal abstraction. | | Emerging |
| 1702 | senadkurtisi/pytorch-image-captioning · Transformer & CNN Image Captioning model in PyTorch. | | Emerging |
| 1703 | jackaduma/Alpaca-LoRA-RLHF-PyTorch · A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer... | | Emerging |
| 1704 | uncbiag/Awesome-Foundation-Models · A curated list of foundation models for vision and language tasks | | Emerging |
| 1705 | zjunlp/LightThinker · [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression | | Emerging |
| 1706 | leftmove/cria · Run LLMs locally with as little friction as possible. | | Emerging |
| 1707 | nikolaydubina/llama2.go · LLaMA-2 in native Go | | Emerging |
| 1708 | hoof-ai/hoof · "Just hoof it!" - A Spotlight-like interface to Ollama | | Emerging |
| 1709 | PCfVW/hf-fetch-model · Fast HuggingFace model downloads for Rust: an embeddable library for... | | Emerging |
| 1710 | ronniross/attention-heatmap-visualizer · A set of scripts to generate full attention-head heatmaps for transformer-based LLMs | | Emerging |
| 1711 | hitz-zentroa/whisper-lm · Add n-gram and large language model (LLM) support to Whisper models. | | Emerging |
| 1712 | OatmealLiu/FineR · [ICLR'24] Democratizing Fine-grained Visual Recognition with Large Language Models | | Emerging |
| 1713 | saddam213/LLamaStack · ASP.NET Core Web, WebApi & WPF implementations for LLama.cpp & LLamaSharp | | Emerging |
| 1714 | BUAADreamer/SPN4CIR · [ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning... | | Emerging |
| 1715 | or4cl3-ai-1/ethereal-insights · Quantum-Enhanced Paranormal Investigation Platform: benchmarks sensor... | | Emerging |
| 1716 | AI4LIFE-GROUP/LLM_Explainer · Code for paper: Are Large Language Models Post Hoc Explainers? | | Emerging |
| 1717 | omron-sinicx/crystalframer · The official code repository for "Rethinking the role of frames for... | | Emerging |
| 1718 | erevusobolus/THERION-SYSTEM · 🦁 THERION: Your AI. Your Hardware. Your Rules. Complete local AI assistant... | | Emerging |
| 1719 | wangcongcong123/ttt · A package for fine-tuning Transformers with TPUs, written in TensorFlow 2.0+ | | Emerging |
| 1720 | GeeeekExplorer/transformers-patch · Patches for Hugging Face Transformers to save memory | | Emerging |
| 1721 | OpenNLPLab/TransnormerLLM · Official implementation of TransNormerLLM: A Faster and Better LLM | | Emerging |
| 1722 | OpenBMB/VisCPM · [ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat... | | Emerging |
| 1723 | yashbonde/rasp · Implementing RASP transformer programming language... | | Emerging |
| 1724 | yzGuu830/efficient-speech-codec · [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector... | | Emerging |
| 1725 | teelinsan/parallel-decoding · Repository of the paper "Accelerating Transformer Inference for Translation... | | Emerging |
| 1726 | ExplainableML/Vision_by_Language · [ICLR 2024] Official repository for "Vision-by-Language for Training-Free... | | Emerging |
| 1727 | Jathurshan0330/Cross-Modal-Transformer · Official repository of cross-modal transformer for interpretable automatic... | | Emerging |
| 1728 | touhi99/askagent · Simple mac/unix terminal assistant with LLM agents capable of various tasks | | Emerging |
| 1729 | systems-genomics-lab/deeptaxa · A deep learning framework for hierarchical taxonomy classification of 16S... | | Emerging |
| 1730 | ziqipang/RandAR · [CVPR 2025 (Oral)] Open implementation of "RandAR" | | Emerging |
| 1731 | JayZhang42/SLED · SLED: Self Logits Evolution Decoding for Improving Factuality in Large... | | Emerging |
| 1732 | Lanerra/reasoning-bank-slm · An experiment that applies Google Research's `ReasoningBank` technique to... | | Emerging |
| 1733 | YassWorks/Tuna · Python library that makes fine-tuning transformer-based models easier and faster. | | Emerging |
| 1734 | sandseb123/local-lora-cookbook · Fine-tune a local LLM on your own app's data in 15 minutes. Runs entirely... | | Emerging |
| 1735 | iamgmujtaba/llama3.2-webUI · LLaMa 3.2 Multimodal Web UI is a user-friendly interface for interacting... | | Emerging |
| 1736 | its-kumar-yash/deep-study-ai-agent · DeepStudy AI automates research, refines queries dynamically, and generates... | | Emerging |
| 1737 | Azure99/BlossomData · A fluent, scalable, and easy-to-use LLM data processing framework. | | Emerging |
| 1738 | Cryolite/kanachan · A Japanese (Riichi) Mahjong AI Framework | | Emerging |
| 1739 | LISA-ITMO/LLM-resume-moderator · Automates the moderation of Russian-language résumés using an LLM. For... | | Emerging |
| 1740 | AIFEG/BenchLMM · [ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large... | | Emerging |
| 1741 | vbdi/divprune · [CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large... | | Emerging |
| 1742 | devdhananjay14/multim · 🔍 Experiment with neural networks for binary classification on multimodal... | | Emerging |
| 1743 | Wang-ML-Lab/llm-continual-learning-survey · [CSUR 2025] Continual Learning of Large Language Models: A Comprehensive Survey | | Emerging |
| 1744 | deep-diver/segformer-tf-transformers · This repository demonstrates how to use TensorFlow-based SegFormer model in... | | Emerging |
| 1745 | HUBioDataLab/SELFormer · SELFormer: Molecular Representation Learning via SELFIES Language Models | | Emerging |
| 1746 | bminixhofer/tokenkit · A toolkit implementing advanced methods to transfer models and model... | | Emerging |
| 1747 | ejaz57/localchat · 🌐 Build a private web interface for local LLMs, ensuring complete privacy... | | Emerging |
| 1748 | vicuna-tools/vicuna-installation-guide · The "vicuna-installation-guide" provides step-by-step instructions for... | | Emerging |
| 1749 | kyegomez/Fusion3D · An extremely experimental model that intakes images and generates 3D scenes... | | Emerging |
| 1750 | automorphic-ai/trex · Enforce structured output from LLMs 100% of the time | | Emerging |
| 1751 | umbertocappellazzo/Llama-AVSR · Official PyTorch implementation of "Large Language Models are Strong... | | Emerging |
| 1752 | hitz-zentroa/whisper-lm-transformers · Add n-gram and LLM language model support to HF Transformers Whisper models. | | Emerging |
| 1753 | thongnt99/learned-sparse-retrieval · Unified Learned Sparse Retrieval Framework | | Emerging |
| 1754 | ValentinOliveira/ai-recruitment-assistant · 🤖 Automate recruitment communication with our AI-powered assistant,... | | Emerging |
| 1755 | NVlabs/NFT · Implementation of Negative-aware Finetuning (NFT) algorithm for "Bridging... | | Emerging |
| 1756 | maxxxzdn/erwin · Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical... | | Emerging |
| 1757 | calpt/awesome-adapter-resources · Collection of Tools and Papers related to Adapters / Parameter-Efficient... | | Emerging |
| 1758 | diogok/llama.cpp.zig · A build.zig for llama.cpp | | Emerging |
| 1759 | arshadshk/SAINT-pytorch · SAINT PyTorch implementation | | Emerging |
| 1760 | vipulraheja/iterater · Official implementation of the paper "IteraTeR: Understanding Iterative... | | Emerging |
| 1761 | AdrianBZG/LLM-distributed-finetune · Efficiently tune any LLM model from HuggingFace using distributed training... | | Emerging |
| 1762 | adarshM84/TextLLaMACode · Transform your writing with TextLLaMA! ✍️🚀 Simplify grammar, translate... | | Emerging |
| 1763 | dmis-lab/Monet · [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers | | Emerging |
| 1764 | akjindal53244/Arithmo · Small and Efficient Mathematical Reasoning LLMs | | Emerging |
| 1765 | MURUGESAN88709/mental-health-finetuned-llama · 🧠 Fine-tune LLaMA for mental health applications, providing insights and... | | Emerging |
| 1766 | developer239/llama.cpp-ts · llama.cpp 🦙 LLM inference in TypeScript | | Emerging |
| 1767 | vlarine/transformers-ru · A list of pretrained Transformer models for the Russian language. | | Emerging |
| 1768 | THUDM/Multilingual-GLM · The multilingual variant of GLM, a general language model trained with... | | Emerging |
| 1769 | xmindflow/MSA-2Net · [BMVC 2024] Official repository of the paper titled "MSA^2 Net: Multi-scale... | | Emerging |
| 1770 | woodRock/fishy-business · Machine Learning for Rapid Evaporative Ionization Mass Spectrometry for... | | Emerging |
| 1771 | yyDing1/ScaleQuest · [ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective... | | Emerging |
| 1772 | pdfosborne/elsciRL · The core repository of the elsciRL framework. | | Emerging |
| 1773 | gustavecortal/gpt-j-fine-tuning-example · Fine-tuning 6-Billion GPT-J (& other models) with LoRA and 8-bit compression | | Emerging |
| 1774 | aj-naik/Text-Summarization · Abstractive and Extractive Text summarization using Transformers. | | Emerging |
| 1775 | Wangbiao2/R1-Track · R1-Track: Direct Application of MLLMs to Visual Object Tracking via... | | Emerging |
| 1776 | zhchen18/ToMBench · ToMBench: Benchmarking Theory of Mind in Large Language Models, ACL 2024. | | Emerging |
| 1777 | BatsResearch/planetarium · Dataset and benchmark for assessing LLMs in translating natural language... | | Emerging |
| 1778 | otto-de/TRON · ⚡️ Implementation of TRON: Transformer Recommender using Optimized... | | Emerging |
| 1779 | BaohaoLiao/RSD · [ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and... | | Emerging |
| 1780 | declare-lab/CICERO · The purpose of this repository is to introduce new dialogue-level... | | Emerging |
| 1781 | Bruce-Lee-LY/decoding_attention · Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using... | | Emerging |
| 1782 | xf-zhao/LoT · Official implementation of LoT paper: "Enhancing Zero-Shot Chain-of-Thought... | | Emerging |
| 1783 | alibaba/easydist · Automated Parallelization System and Infrastructure for Multiple Ecosystems | | Emerging |
| 1784 | nsidn98/LLaMAR · Code for our paper LLaMAR: LM-based Long-Horizon Planner for Multi-Agent Robotics | | Emerging |
| 1785 | Ereboas/MagiCodec · A single-layer, streaming codec model providing SOTA audio quality and... | | Emerging |
| 1786 | daviden1013/llm-ie · A comprehensive toolkit that provides building blocks for LLM-based named... | | Emerging |
| 1787 | lucasjinreal/Namo-R1 · A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from... | | Emerging |
| 1788 | nlpodyssey/gotokenizers · Go implementation of today's most used tokenizers | | Emerging |
| 1789 | palonso/MAEST · Pre-training, fine-tuning, and inference code with the MAEST models for... | | Emerging |
| 1790 | loong64/ollama · Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other... | | Emerging |
| 1791 | sail-sg/Attention-Sink · [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical... | | Emerging |
| 1792 | ExplainableML/WaffleCLIP · Official repository for the ICCV 2023 paper: "Waffling around for... | | Emerging |
| 1793 | KolosalAI/kolosal-server · Kolosal AI is an open-source, lightweight alternative to Ollama to run... | | Emerging |
| 1794 | uakarsh/latr · Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel... | | Emerging |
| 1795 | mybigday/llama.node · Node.js binding of llama.cpp | | Emerging |
| 1796 | Sakeeb91/text2sql-agent · Self-correcting AI agent for natural language to SQL using HuggingFace... | | Emerging |
| 1797 | DAMO-NLP-SG/multilingual-safety-for-LLMs · [ICLR 2024] Data for "Multilingual Jailbreak Challenges in Large Language Models" | | Emerging |
| 1798 | BubbleJoe-BrownU/TransformerHub · This is a repository of transformer-like models, including Transformer, GPT,... | | Emerging |
| 1799 | arcee-ai/PruneMe · Automated Identification of Redundant Layer Blocks for Pruning in Large... | | Emerging |
| 1800 | PathologyFoundation/plip · Pathology Language and Image Pre-Training (PLIP) is the first vision and... | | Emerging |