All Transformer Models

6,429 models ranked by quality score · Page 19 of 65

Showing 1801–1900 of 6,429
# Model Score Tier
1801 linonetwo/langchain-alpaca

Run Alpaca LLM in LangChain

34
Emerging
1802 ausboss/Local-LLM-Langchain

Load local LLMs effortlessly in a Jupyter notebook for testing purposes...

34
Emerging
1803 nawnoes/pytorch-gpt-x

An implementation of an autoregressive language model using an improved...

34
Emerging
1804 ksm26/Open-Source-Models-with-Hugging-Face

"Open Source Models with Hugging Face" course empowers you with the skills...

34
Emerging
1805 nanowell/Differential-Transformer-PyTorch

PyTorch implementation of the Differential-Transformer architecture for...

34
Emerging
1806 Hon-Wong/VoRA

[Fully open] [Encoder-free MLLM] Vision as LoRA

34
Emerging
1807 VikingOwl91/vessel

A lightweight, local-first web UI for managing Ollama models.

34
Emerging
1808 Gen-Verse/ReasonFlux

[NeurIPS 2025 Spotlight] LLM post-training suite — featuring ReasonFlux,...

34
Emerging
1809 PKU-Alignment/beavertails

BeaverTails is a collection of datasets designed to facilitate research on...

34
Emerging
1810 wang2226/Awesome-LLM-Decoding

📜 Paper list on decoding methods for LLMs and LVLMs

34
Emerging
1811 dougeeai/llama-cpp-python-wheels

Pre-built wheels for llama-cpp-python across platforms and CUDA versions

34
Emerging
1812 HaoAreYuDong/MachineLearningLM

Scaling In-context Learning from Few-shot to 1,024-shot on Tabular ML

34
Emerging
1813 Pengxin-Guo/FedSA-LoRA

Selective Aggregation for Low-Rank Adaptation in Federated Learning [ICLR 2025]

34
Emerging
1814 neosantara-xyz/glm-ocr-inference

Fast and lightweight GLM-OCR inference on Modal with an OpenAI-compatible...

34
Emerging
1815 asigalov61/Giant-Music-Transformer

[SOTA] [92% acc] 786M-8k-44L-32H multi-instrumental music transformer with...

34
Emerging
1816 mtuann/llm-updated-papers

Papers related to Large Language Models in all top venues

34
Emerging
1817 Active-Matrix/proximity

Proximity is an AI-powered news aggregator and TL;DR summarizer with a...

34
Emerging
1818 kkahatapitiya/LangRepo

Code for our ACL 2025 paper "Language Repository for Long Video Understanding"

34
Emerging
1819 gitctrlx/llama.cu

Llama from scratch in CUDA with Flash Attention.

34
Emerging
1820 pymc-labs/transpailer

LLM-based, self-correcting transpiler. Supports JAX, PyTorch, Rust, PyMC, Stan.

34
Emerging
1821 VectorInstitute/atomgen

Library for handling atomistic graph datasets focusing on transformer-based...

34
Emerging
1822 liaoyuhua/LLM4TS

Large Language & Foundation Models for Time Series.

34
Emerging
1823 liuqidong07/LEADER-pytorch

[arXiv'24] The official implementation code of LEADER.

34
Emerging
1824 DunnBC22/Vision_Audio_and_Multimodal_Projects

This repository includes all computer vision, audio, document AI, and...

34
Emerging
1825 antoninodimaggio/Hugging-Captions

Generate realistic Instagram captions using transformers 🤗

34
Emerging
1826 louisbrulenaudet/tsdae

Transformer-based Denoising AutoEncoder for Sentence Transformers...

34
Emerging
1827 hesamsheikh/llm-mechanics

Coding an LLM and its building blocks from scratch.

34
Emerging
1828 chelsea0x3b/llama-dfdx

LLaMa 7b with CUDA acceleration implemented in Rust. Minimal GPU memory needed!

34
Emerging
1829 rezazad68/transdeeplab

TransDeepLab: Convolution-Free Transformer-based DeepLab v3+ for Medical...

34
Emerging
1830 czg1225/CoDe

[CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive...

34
Emerging
1831 WayneMao/RoboMatrix

The Official Implementation of RoboMatrix

34
Emerging
1832 IDSIA/modern-srwm

Official repository for the paper "A Modern Self-Referential Weight Matrix...

34
Emerging
1833 StargazerX0/ScaleKV

[NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with...

34
Emerging
1834 ModelTC/QLLM

[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate...

34
Emerging
1835 Beomi/BitNet-Transformers

0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of...

34
Emerging
1836 SqueezeAILab/KVQuant

[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with...

34
Emerging
1837 mechramc/Orion

Local AI runtime for training & running small LLMs directly on Apple Neural...

34
Emerging
1838 CEC-Agent/CEC

Official Implementation of NeurIPS'23 Paper "Cross-Episodic Curriculum for...

34
Emerging
1839 theodo-group/GenossGPT

One API for all LLMs either Private or Public (Anthropic, Llama V2, GPT...

34
Emerging
1840 SakanaAI/evo-memory

Code to train and evaluate Neural Attention Memory Models to obtain...

34
Emerging
1841 AkiRusProd/numpy-transformer

A numpy implementation of the Transformer model in "Attention is All You Need"

34
Emerging
1842 iKernels/transformers-lightning

A collection of Models, Datasets, DataModules, Callbacks, Metrics, Losses...

34
Emerging
1843 ARUNAGIRINATHAN-K/huggingface-on-aws

Deploy and train Hugging Face models on AWS — SageMaker, Bedrock, ECS, EKS, and more.

34
Emerging
1844 will-thompson-k/tldr-transformers

The "tl;dr" on a few notable transformer papers (pre-2022).

34
Emerging
1845 dsindex/iclassifier

Reference PyTorch code for intent classification

34
Emerging
1846 holarissun/RewardModelingBeyondBradleyTerry

Official implementation of the ICLR'2025 paper: Rethinking Bradley-Terry Models...

34
Emerging
1847 Nikityyy/lille

A powerful 130-million-parameter model trained from scratch as part of a...

34
Emerging
1848 nlpkeg/Know-MRI

This is the official code for the [ACL 2025 Demo] paper: Know-MRI: A...

34
Emerging
1849 cxcscmu/Montessori-Instruct

Official repository for Montessori-Instruct: Generate Influential Training...

34
Emerging
1850 Infini-AI-Lab/Sequoia

Scalable and robust tree-based speculative decoding algorithm

34
Emerging
1851 VITA-Group/Ms-PoE

"Found in the Middle: How Language Models Use Long Contexts Better via...

34
Emerging
1852 sunnynguyen-ai/llm-attention-visualizer

Interactive tool for analyzing attention patterns in transformer models with...

34
Emerging
1853 abacaj/transformers-docker

Run, build, test transformer models using docker

34
Emerging
1854 MNoorFawi/curlora

The code repository for the CURLoRA research paper. Stable LLM continual...

34
Emerging
1855 ZJLAB-AMMI/LLM4Teach

Python code to implement LLM4Teach, a policy distillation approach for...

34
Emerging
1856 jesus3476/Fire-Detection-Siglip2

Fire-Detection-Siglip2 is an image classification vision-language encoder...

34
Emerging
1857 haoliuhl/instructrl

Instruction Following Agents with Multimodal Transformers

34
Emerging
1858 monk1337/auto-ollama

Run Ollama & GGUF easily with a single command

34
Emerging
1859 AntonioGr7/pratical-llms

A collection of hands-on notebooks for LLM practitioners

34
Emerging
1860 pymc-labs/transalchemy

LLM-based, self-correcting transpiler. Supports JAX, PyTorch, Rust, PyMC, Stan.

34
Emerging
1861 starmpcc/CAMEL

Clinically Adapted Model Enhanced from LLaMA

34
Emerging
1862 RahulSChand/gpu_poor

Calculate token/s & GPU memory requirement for any LLM. Supports...

34
Emerging
1863 snktshrma/ngps_flight

Global vision positioning system for UAVs in outdoor GNSS-denied environments

34
Emerging
1864 yang-ai-lab/OSF-Open-Sleep-FM

OSF: On Pre-training and Scaling of Sleep Foundation Models

34
Emerging
1865 IvanBongiorni/maximal

A TensorFlow-compatible Python library that provides models and layers to...

34
Emerging
1866 Tebmer/Awesome-Knowledge-Distillation-of-LLMs

This repository collects papers for "A Survey on Knowledge Distillation of...

34
Emerging
1867 OneInterface/realtime-bakllava

llama.cpp with the BakLLaVA model describes what it sees

34
Emerging
1868 moritztng/fltr

Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B.

34
Emerging
1869 fermyon/ai-examples

A collection of serverless apps that show how Fermyon's Serverless AI...

34
Emerging
1870 hpdps-group/ElasticMM

ElasticMM: Elastic and Efficient MLLM Serving System

34
Emerging
1871 ASSERT-KTH/repairllama

RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program...

34
Emerging
1872 5aharsh/collama

Run Ollama LLM models in Google Colab for free

34
Emerging
1873 liupras/Practical-local-LLM-programming

Programming with local large language models.

34
Emerging
1874 haesleinhuepf/human-eval-bia

Benchmarking Large Language Models for Bio-Image Analysis Code Generation

34
Emerging
1875 david0154/david-ai

🤖 D.A.V.I.D AI - Advanced AI assistant with voice control, gesture...

34
Emerging
1876 zerovl/ZeroVL

[ECCV2022] Contrastive Vision-Language Pre-training with Limited Resources

34
Emerging
1877 CLAIRE-Labo/quantile-reward-policy-optimization

Official codebase for "Quantile Reward Policy Optimization: Alignment with...

34
Emerging
1878 eliahuhorwitz/Spectral-DeTuning

Official PyTorch Implementation for the "Recovering the Pre-Fine-Tuning...

34
Emerging
1879 rasbt/gradient-accumulation-blog

Finetuning BLOOM on a single GPU using gradient-accumulation

34
Emerging
1880 ictnlp/TruthX

Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large...

34
Emerging
1881 eugenehp/bitnet-cpp-rs

Rust bindings for bitnet.cpp based on llama-cpp-4

34
Emerging
1882 LucknowAI/Lucknow-LLM

Collecting data for building Lucknow's first LLM

34
Emerging
1883 steinbergmedia/libmusictok

C++ Library for tokenizing MIDI files, designed to be compatible with the...

34
Emerging
1884 davide-coccomini/MINTIME-Multi-Identity-size-iNvariant-TIMEsformer-for-Video-Deepfake-Detection

Code for Video Deepfake Detector from "MINTIME: Multi-Identity...

34
Emerging
1885 kyegomez/CNNGPT

This CNN-based language model leverages causal and dilated convolutions,...

34
Emerging
1886 henrikalbihn/gliner-as-a-service

GLiNER model in a FastAPI microservice.

34
Emerging
1887 fboulnois/llm-leaderboard-csv

CSVs of the Huggingface and LMArena LLM leaderboards, along with the code to...

34
Emerging
1888 TencentARC/ST-LLM

[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language...

34
Emerging
1889 zhenyi4/ssa

Official repository for "SSA: Sparse Sparse Attention by Aligning Full and...

34
Emerging
1890 ykjaat6104/LLM-Cost-and-Token-Efficiency-Analysis

A benchmark study analyzing cost and token efficiency across 14 LLMs from 5...

34
Emerging
1891 TrevTron/indiedroid-nova-llm

Running Llama 3.1 8B and other LLMs on RK3588 NPU - benchmarks and setup guides

34
Emerging
1892 thruthseeker/LionLock_FDE_OSS

Open source fatigue detection engine for large language models with trust overlay

34
Emerging
1893 RLHFlow/Online-RLHF

A recipe for online RLHF and online iterative DPO.

34
Emerging
1894 kyegomez/MGQA

The open source implementation of the multi grouped query attention by the...

34
Emerging
1895 itsqyh/Awesome-LMMs-Mechanistic-Interpretability

A curated collection of resources focused on the Mechanistic...

34
Emerging
1896 ashioyajotham/fingpt_trader

A quant trading system platform based on FinGPT, demonstrating new...

34
Emerging
1897 tsinghua-fib-lab/ANeurIPS2024_SPV-MIA

[NeurIPS'24] "Membership Inference Attacks against Fine-tuned Large Language...

34
Emerging
1898 InternLM/OREAL

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

34
Emerging
1899 TIGER-AI-Lab/General-Reasoner

General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]

34
Emerging
1900 jhcho99/GSRTR

[BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition...

33
Emerging