All Transformer Models

6,429 models ranked by quality score · Page 9 of 65

Showing 801–900 of 6,429
# Model Score Tier
801 yuchenlin/LLM-Blender

[ACL2023] We introduce LLM-Blender, an innovative ensembling framework to...

44
Emerging
802 tommasomncttn/mergenetic

Flexible library for merging large language models (LLMs) via evolutionary...

44
Emerging
803 hiyouga/FastEdit

🩹Editing large language models within 10 seconds⚡

44
Emerging
804 monologg/transformers-android-demo

📲 Transformers Android examples (TensorFlow Lite & PyTorch Mobile)

44
Emerging
805 1b5d/llm-api

Run any Large Language Model behind a unified API

44
Emerging
806 poloclub/llm-landscape

NeurIPS'24 - LLM Safety Landscape

44
Emerging
807 cdpierse/transformers-interpret

Model explainability that works seamlessly with 🤗 transformers. Explain your...

44
Emerging
808 iaalm/llama-api-server

An OpenAI API-compatible REST server for llama.

44
Emerging
809 gluonfield/enchanted

Enchanted is an iOS and macOS app for chatting with private self-hosted...

44
Emerging
810 srgtuszy/llama-cpp-swift

Swift bindings for llama-cpp library

44
Emerging
811 gitabtion/BertBasedCorrectionModels

PyTorch implementations of BERT-based Spelling Error Correction Models. ...

44
Emerging
812 freshllms/freshqa

Data and code for FreshLLMs (https://arxiv.org/abs/2310.03214)

44
Emerging
813 hao-ai-lab/Dynasor

[NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning model...

44
Emerging
814 vijaydwivedi75/gnn-lspe

Source code for GNN-LSPE (Graph Neural Networks with Learnable Structural...

44
Emerging
815 uclaml/SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

44
Emerging
816 monologg/KoBERT-KorQuAD

Korean MRC (KorQuAD) with KoBERT

44
Emerging
817 SqueezeAILab/LLMCompiler

[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling

44
Emerging
818 0hq/WebGPT

Run a GPT model in the browser with WebGPU. An implementation of GPT inference...

44
Emerging
819 joyehuang/minimind-notes

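🚀 [Building an LLM from scratch] A minimalist guide to the principles and practice of training large models. Includes core code and controlled experiments for Transformer, Pretraining, and SFT. | A...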

44
Emerging
820 LibreTranslate/Locomotive

Toolkit for training/converting LibreTranslate compatible language models 🚂

44
Emerging
821 FMInference/FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

44
Emerging
822 RManLuo/reasoning-on-graphs

Official Implementation of ICLR 2024 paper: "Reasoning on Graphs: Faithful...

44
Emerging
823 anseryuer/Local_LLM_Deployment_Guide_Chinese

A Chinese-language tutorial on deploying large language models locally

44
Emerging
824 powerserve-project/PowerServe

High-speed and easy-to-use LLM serving framework for local deployment

44
Emerging
825 CASE-Lab-UMD/Unified-MoE-Compression

The official implementation of the paper "Towards Efficient Mixture of...

44
Emerging
826 SakanaAI/text-to-lora

Hypernetworks that adapt LLMs for specific benchmark tasks using only...

44
Emerging
827 AI-Hypercomputer/jetstream-pytorch

PyTorch/XLA integration with JetStream (https://github.com/google/JetStream)...

44
Emerging
828 Arunprakash-A/DL-Pytorch-Workshop

Develop DL models using PyTorch and Hugging Face

44
Emerging
829 boyiwei/alignment-attribution-code

[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and...

44
Emerging
830 NX-AI/mlstm_kernels

Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels.

44
Emerging
831 git-disl/Vaccine

This is the official code for the paper "Vaccine: Perturbation-aware...

44
Emerging
832 mdrokz/rust-llama.cpp

llama.cpp Rust bindings

44
Emerging
833 lxe/simple-llm-finetuner

Simple UI for LLM Model Finetuning

44
Emerging
834 salesforce/ETSformer

PyTorch code for ETSformer: Exponential Smoothing Transformers for...

44
Emerging
835 deep-symbolic-mathematics/TPSR

[NeurIPS 2023] This is the official code for the paper "TPSR:...

44
Emerging
836 SkalskiP/vlms-zero-to-hero

This series will take you on a journey from the fundamentals of NLP and...

44
Emerging
837 JetRunner/BERT-of-Theseus

⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT...

43
Emerging
838 dipanjanS/adv_nlp_workshop_odsc_europe22

Extensive tutorials for the Advanced NLP Workshop in Open Data Science...

43
Emerging
839 sinanuozdemir/oreilly-pytorch-dl

Code for Deep Learning for Modern AI

43
Emerging
840 IST-DASLab/marlin

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up...

43
Emerging
841 cdpierse/script_buddy_v2

Script Buddy v2 is a film script text generation tool built using film...

43
Emerging
842 snap-stanford/relgt

Relational Graph Transformer

43
Emerging
843 stay-leave/enhance_llm

Practice notes on large language models

43
Emerging
844 IlyaGusev/rulm

Language modeling and instruction tuning for Russian

43
Emerging
845 armbues/SiLLM

SiLLM simplifies the process of training and running Large Language Models...

43
Emerging
846 ShivamRajSharma/Transformer-Architectures-From-Scratch

Implementation of transformer-based architectures in PyTorch.

43
Emerging
847 AviSoori1x/Tuning-the-Finetuning

Tuning the Finetuning: An exploration of achieving success with QLoRA

43
Emerging
848 jasonvanf/llama-trl

LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA

43
Emerging
849 sinanuozdemir/oreilly-huggingface-tour

A Crash Course in Hugging Face

43
Emerging
850 gohjiayi/suicidal-text-detection

Building a suicidal text detection model and mental health chatbot with deep...

43
Emerging
851 cambridgeltl/visual-med-alpaca

Visual Med-Alpaca is an open-source, multi-modal foundation model designed...

43
Emerging
852 turtlesoupy/this-word-does-not-exist

This Word Does Not Exist

43
Emerging
853 Tongjilibo/build_MiniLLM_from_scratch

Building a MiniLLM from 0 to 1 (pretrain + SFT + DPO, in progress)

43
Emerging
854 analyticalrohit/llms-from-scratch

Build a ChatGPT like LLM from scratch in PyTorch, explained step by step.

43
Emerging
855 laelhalawani/gguf_modeldb

A quick and optimized solution to manage llama based gguf quantized models,...

43
Emerging
856 bayesgroup/code_transformers

Empirical Study of Transformers for Source Code & A Simple Approach for...

43
Emerging
857 openjlc/riscv64-library

Some of the libraries (docs) on the RISCV64 architecture are easy for users...

43
Emerging
858 princeton-nlp/SimPO

[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward

43
Emerging
859 RManLuo/graph-constrained-reasoning

Official Implementation of ICML 2025 Paper: "Graph-constrained Reasoning:...

43
Emerging
860 ddzipp/AutoAudit

AutoAudit: the LLM for Cyber Security (a cybersecurity large language model)

43
Emerging
861 lxuechen/private-transformers

A codebase that makes differentially private training of transformers easy.

43
Emerging
862 datastone-spirit/spirit-lora-trainer

Spirit Lora Trainer is a robust toolkit for training Flux1-LoRA models with...

43
Emerging
863 rohan-paul/LLM-FineTuning-Large-Language-Models

LLM (Large Language Model) FineTuning

43
Emerging
864 iPieter/RobBERT

A Dutch RoBERTa-based language model

43
Emerging
865 hoangsonww/Spot-the-Scam-AI-Job-Fraud

🎒 An AI/ML-powered, full-stack job-posting fraud copilot delivering...

43
Emerging
866 salesforce/CodeTF

CodeTF: One-stop Transformer Library for State-of-the-art Code LLM

43
Emerging
867 kyegomez/SparseAttention

PyTorch implementation of the sparse attention from the paper: "Generating...

43
Emerging
868 gitabtion/SoftMaskedBert-PyTorch

🙈 An unofficial implementation of SoftMaskedBert based on huggingface/transformers.

43
Emerging
869 kevinMEH/keyscan

Keyscan: AI-powered API key scanner for GitHub Gists.

43
Emerging
870 Atome-FE/llama-node

Believe in AI democratization. llama for nodejs backed by llama-rs,...

43
Emerging
871 MagedSaeed/generate-sequences

A python package made to generate sequences (greedy and beam-search) from...

43
Emerging
872 eric-ai-lab/MiniGPT-5

Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language...

43
Emerging
873 Kartik-3004/SegFace

[AAAI 25] SegFace: Face Segmentation of Long-tail classes

43
Emerging
874 varunkumar-dev/TransformersDataAugmentation

Code associated with the "Data Augmentation using Pre-trained Transformer...

43
Emerging
875 gitkaz/mlx_gguf_server

This is a FastAPI based LLM server. Load multiple LLM models (MLX or...

43
Emerging
876 GAIR-NLP/MegaScience

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

43
Emerging
877 datamllab/LongLM

[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

43
Emerging
878 magpie-align/magpie

[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs...

43
Emerging
879 AliHaiderAhmad001/GPT-from-Scratch-with-Tensorflow

Implementation for "Improving Language Understanding by Generative...

43
Emerging
880 alibaba/GraphTranslator

GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks

43
Emerging
881 zeozeozeo/ellama

Friendly interface to chat with an Ollama instance.

43
Emerging
882 CodeWithKyrian/transformers-php

Transformers PHP is a toolkit for PHP developers to add machine learning...

43
Emerging
883 DC-research/TEMPO

The official code for "TEMPO: Prompt-based Generative Pre-trained...

43
Emerging
884 CLAIRE-Labo/EvoTune

Efficiently discovering algorithms via LLMs with evolutionary search and...

43
Emerging
885 deep-diver/llamaduo

[ACL'25] Official Code for LlamaDuo: LLMOps Pipeline for Seamless Migration...

43
Emerging
886 SamsungSAILMontreal/nino

Code for "Accelerating Training with Neuron Interaction and Nowcasting...

43
Emerging
887 DAGroup-PKU/MHLA

MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head...

43
Emerging
888 ucbrise/graphtrans

Representing Long-Range Context for Graph Neural Networks with Global Attention

43
Emerging
889 Gleghorn-Lab/Protify

Low code molecular property prediction

43
Emerging
890 amirhossein-kz/HiFormer

HiFormer: Hierarchical Multi-scale Representations Using Transformers for...

43
Emerging
891 hao-ai-lab/JacobiForcing

Jacobi Forcing: Fast and Accurate Diffusion-style Decoding

43
Emerging
892 gupta-abhay/pytorch-vit

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

43
Emerging
893 aliemo/transfomers-silicon-research

Research and Materials on Hardware implementation of Transformer Model

43
Emerging
894 ml4fp/2025-lbnl

ML4FP 2025: notebooks used for the Machine Learning for Fundamental Physics...

43
Emerging
895 InternLM/CapRL

[ICLR 2026] An official implementation of "CapRL: Stimulating Dense Image...

43
Emerging
896 salcc/QuantumTransformers

Quantum Transformers for High Energy Physics Analysis at the Large Hadron Collider

43
Emerging
897 nerve-sparks/iris_android

IRIS is an Android app for interfacing with GGUF / llama.cpp models locally.

43
Emerging
898 xlang-ai/Binder

[ICLR 2023] Code for the paper "Binding Language Models in Symbolic Languages"

43
Emerging
899 modelscope/dash-infer

DashInfer is a native LLM inference engine aiming to deliver...

43
Emerging
900 VikParuchuri/textbook_quality

Generate textbook-quality synthetic LLM pretraining data

43
Emerging