All Transformer Models
6,429 models ranked by quality score · Page 21 of 65
| # | Model | Score | Tier |
|---|---|---|---|
| 2001 | myscience/x-lstm: Pytorch implementation of the xLSTM model by Beck et al. (2024) | | Emerging |
| 2002 | vivy-yi/awesome-llm-training-inference: Curated list of LLM training and inference frameworks, tools, and resources.... | | Emerging |
| 2003 | nareshis21/Truelarge-RT: Android inference engine running 20B+ parameter LLMs on 4GB-8GB RAM devices.... | | Emerging |
| 2004 | microsoft/encoder-decoder-slm: Efficient encoder-decoder architecture for small language models (≤1B... | | Emerging |
| 2005 | mdegans/drama_llama: Yet another `llama.cpp` Rust wrapper | | Emerging |
| 2006 | yaph/charla: A terminal based chat application that works with AI language models. | | Emerging |
| 2007 | QwenLM/ParScale: Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling | | Emerging |
| 2008 | Simplifine-gamedev/Simplifine: 🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud... | | Emerging |
| 2009 | urban-mobility-generation/Language-Modeling-for-Urban-Mobility: Language Modeling for Urban Mobility: A Data-Centric Review and Guidelines | | Emerging |
| 2010 | CogitatorTech/zigformer: An educational transformer-based LLM in pure Zig | | Emerging |
| 2011 | rhnfzl/SqueakyCleanText: Text preprocessing and PII anonymisation for NLP/ML. ONNX NER ensemble,... | | Emerging |
| 2012 | luffycodes/Tutorbot-Spock: An Education Tutoring Chatbot based on Learning Science Principles powered... | | Emerging |
| 2013 | yaodongC/awesome-instruction-dataset: A collection of open-source dataset to train instruction-following LLMs... | | Emerging |
| 2014 | LorenzoAgnolucci/BERT_for_ABSA: In this work (Targeted) Aspect-Based Sentiment Analysis task is converted to... | | Emerging |
| 2015 | datawhalechina/unlock-hf: Unlock the many ways to use the HuggingFace ecosystem | | Emerging |
| 2016 | HeegyuKim/language-model: A project for training Korean language models (Flax, Pytorch with Huggingface Accelerate) | | Emerging |
| 2017 | Mmorgan-ML/Neuromodulatory-Control-Networks: Neuromodulatory Control Networks (NCNs), a novel LLM architectural... | | Emerging |
| 2018 | Koratahiu/Advanced_Optimizers: A family of highly efficient, lightweight yet powerful optimizers. | | Emerging |
| 2019 | zabir-nabil/awesome-multilingual-large-language-models: A comprehensive collection of multilingual datasets and large language... | | Emerging |
| 2020 | takara-ai/SwarmFormer: A pytorch implementation of SwarmFormer for text classification. | | Emerging |
| 2021 | armbues/SiLLM-examples: Examples for using the SiLLM framework for training and running Large... | | Emerging |
| 2022 | sayakpaul/deploy-hf-tf-vision-models: This repository shows various ways of deploying a vision model (TensorFlow)... | | Emerging |
| 2023 | seongminp/transformers-into-vaes: Code for "Finetuning Pretrained Transformers into Variational Autoencoders" | | Emerging |
| 2024 | mshenoda/roberta-spam: RoBERTa based Spam Message Detection | | Emerging |
| 2025 | rezazad68/TMUnet: Contextual Attention Network: Transformer Meets U-Net | | Emerging |
| 2026 | liuyang-ict/SAP-DETR: [CVPR 2023] Official implementation of "SAP-DETR: Bridging the Gap between... | | Emerging |
| 2027 | XunshanMan/MVGFormer: This is the official implementation of the work presented at CVPR 2024,... | | Emerging |
| 2028 | clip-italian/clip-italian: CLIP (Contrastive Language–Image Pre-training) for Italian | | Emerging |
| 2029 | AILab-CVC/M2PT: [CVPR 2024] Multimodal Pathway: Improve Transformers with Irrelevant Data... | | Emerging |
| 2030 | oxidized-transformers/oxidized-transformers: Modular Rust transformer/LLM library using Candle | | Emerging |
| 2031 | NSLab-CUK/Unified-Graph-Transformer: Unified Graph Transformer (UGT) is a novel Graph Transformer model... | | Emerging |
| 2032 | tsinghua-fib-lab/UniST: Official implementation for "UniST: A Prompt-Empowered Universal Model for... | | Emerging |
| 2033 | sanjaradylov/smiles-gpt: Generative Pre-Training from Molecules | | Emerging |
| 2034 | remixer-dec/botality-ii: telegram bot for self-hosted local inference of stable diffusion,... | | Emerging |
| 2035 | SapienzaNLP/ita-bench: A collection of Italian benchmarks for LLM evaluation | | Emerging |
| 2036 | BFCmath/FinetuneAI_Learning: How to effectively finetune CV/LLM models (without local gpu) | | Emerging |
| 2037 | CJReinforce/PURE: Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is... | | Emerging |
| 2038 | soumyadip1995/BabyGPT: Something in the middle of Karpathy's mingpt model and video lectures, ... | | Emerging |
| 2039 | yinzhangyue/SelfAware: Do Large Language Models Know What They Don’t Know? | | Emerging |
| 2040 | Nota-NetsPresso/shortened-llm: Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop] | | Emerging |
| 2041 | GunjanDhanuka/stocks-trading-bot: A multi-purpose repository with Sentiment Analysis of Stocks news, and... | | Emerging |
| 2042 | Buyun-Liang/SECA: [NeurIPS 2025] SECA: Semantically Equivalent and Coherent Attacks for... | | Emerging |
| 2043 | rti/gptvis: Understanding Transformers Using A Minimal Example | | Emerging |
| 2044 | fvliang/DART: Official Implementation of DART (DART: Diffusion-Inspired Speculative... | | Emerging |
| 2045 | ShiZhengyan/InstructionModelling: [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With... | | Emerging |
| 2046 | albrateanu/ModalFormer: [2025] ModalFormer: Multimodal Transformer for Low-Light Image Enhancement | | Emerging |
| 2047 | katha-ai/EmoTx-CVPR2023: [CVPR 2023] Official code repository for "How you feelin'? Learning Emotions... | | Emerging |
| 2048 | leondz/lm_risk_cards: Risks and targets for assessing LLMs & LLM vulnerabilities | | Emerging |
| 2049 | BillChan226/HALC: [ICML 2024] Official implementation for "HALC: Object Hallucination... | | Emerging |
| 2050 | RishabSA/Sketch2Graphviz: Sketch2Graphviz allows you to convert sketches or images of graphs and... | | Emerging |
| 2051 | robert-mcdermott/ollama-batch-cluster: Large Scale Batch Processing with Ollama | | Emerging |
| 2052 | praj2408/Text-Summarizer-Project: The text summarizer project is an innovative tool designed to condense... | | Emerging |
| 2053 | FutureComputing4AI/HGConv: HGConv: Holographic Global Convolutional Networks | | Emerging |
| 2054 | amazon-science/text_generation_diffusion_llm_topic: Topic Embedding, Text Generation and Modeling using diffusion | | Emerging |
| 2055 | IDSIA/automated-cl: Official repository for the paper "Automating Continual Learning" | | Emerging |
| 2056 | IDSIA/lmtool-fwp: PyTorch Language Modeling Toolkit for Fast Weight Programmers | | Emerging |
| 2057 | waltonfuture/InstructionGPT-4: InstructionGPT-4 | | Emerging |
| 2058 | laelhalawani/gguf_llama: Wrapper for simplified use of Llama2 GGUF quantized models. | | Emerging |
| 2059 | umitkacar/pytorch-interactive-learning: Professional PyTorch CLI learning tool with 24 comprehensive lessons - From... | | Emerging |
| 2060 | yangjianxin1/Firefly-LLaMA2-Chinese: Firefly Chinese LLaMA-2 large model, supporting incremental pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, Intern... | | Emerging |
| 2061 | wangxiao5791509/MultiModal_BigModels_Survey: [MIR-2023-Survey] A continuously updated paper list for multi-modal... | | Emerging |
| 2062 | Nithin-Holla/meme_challenge: Repository containing code from team Kingsterdam for the Hateful Memes Challenge | | Emerging |
| 2063 | StyrbjornKall/TRIDENT: A collection of transformer-based models and developmental scripts presented... | | Emerging |
| 2064 | The-Swarm-Corporation/Hyena-Y: A PyTorch implementation of the Hyena-Y model, a convolution-based... | | Emerging |
| 2065 | templetwo/PhaseGPT: Kuramoto Phase-Coupled Oscillator Attention in Transformers | | Emerging |
| 2066 | naokishibuya/simple_transformer: A Transformer Implementation that is easy to understand and customizable. | | Emerging |
| 2067 | nihalsangeeth/behaviour-seq-transformer: Pytorch implementation of "Behaviour Sequence Transformer for E-commerce... | | Emerging |
| 2068 | cyk1337/Transformer-in-PyTorch: Transformer/Transformer-XL/R-Transformer examples and explanations | | Emerging |
| 2069 | yueliu1999/FlipAttack: [ICML 2025] An official source code for paper "FlipAttack: Jailbreak LLMs... | | Emerging |
| 2070 | yao8839836/kg-llm: Exploring large language models for knowledge graph completion. ICASSP 2025 | | Emerging |
| 2071 | anyantudre/Florence-2-Vision-Language-Model: Florence-2 is a novel vision foundation model with a unified, prompt-based... | | Emerging |
| 2072 | fran-martinez/bio_ner_bert: BERT finetuned on NER downstream tasks | | Emerging |
| 2073 | rasbt/faster-pytorch-blog: Outlining techniques for improving the training performance of your PyTorch... | | Emerging |
| 2074 | mlverse/mall: Run multiple LLM predictions against a data frame with R and Python | | Emerging |
| 2075 | mytechnotalent/falcongpt: Simple GPT app that uses the falcon-7b-instruct model with a Flask front-end. | | Emerging |
| 2076 | aquadzn/deploy-transformers: Easily deploy a state-of-the-art language model from HuggingFace's Transformers | | Emerging |
| 2077 | BorealisAI/flora-opt: This is the official repository for the paper "Flora: Low-Rank Adapters Are... | | Emerging |
| 2078 | alvion427/PerroPastor: Run Llama based LLMs in Unity entirely in compute shaders with no dependencies | | Emerging |
| 2079 | midway2333/Tower2: Multimodal language model architecture | | Emerging |
| 2080 | zubair-irshad/NeRF-MAE: [ECCV 2024] Pytorch code for our ECCV'24 paper NeRF-MAE: Masked AutoEncoders... | | Emerging |
| 2081 | dsdanielpark/open-llm-datasets: Repository for organizing datasets and papers used in Open LLM. | | Emerging |
| 2082 | SafeAILab/RAIN: [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning | | Emerging |
| 2083 | deep-symbolic-mathematics/llm-srbench: [ICML2025 Oral] LLM-SRBench: A New Benchmark for Scientific Equation... | | Emerging |
| 2084 | HKUNLP/efficient-attention: [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control... | | Emerging |
| 2085 | synlp/R2-LLM: The official GitHub repository of the AAAI-2024 paper "Bootstrapping Large... | | Emerging |
| 2086 | azminewasi/Awesome-LLMs-ICLR-24: It is a comprehensive resource hub compiling all LLM papers accepted at the... | | Emerging |
| 2087 | open-compass/ANAH: [ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO | | Emerging |
| 2088 | mohd-faizy/06P_Sentiment-Analysis-With-Deep-Learning-Using-BERT: Finetuning BERT in PyTorch for sentiment analysis. | | Emerging |
| 2089 | The-Martyr/CausalMM: [ICLR 2025] Mitigating Modality Prior-Induced Hallucinations in Multimodal... | | Emerging |
| 2090 | smvorwerk/xlstm-cuda: Cuda implementation of Extended Long Short Term Memory (xLSTM) with C++ and... | | Emerging |
| 2091 | Qwen-Applications/CLIPO: CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR | | Emerging |
| 2092 | SalesforceAIResearch/Elastic-Reasoning: Make reasoning models scalable | | Emerging |
| 2093 | arshadshk/Last_Query_Transformer_RNN-PyTorch: Implementation of the paper "Last Query Transformer RNN for knowledge... | | Emerging |
| 2094 | anas-zafar/LLM-Survey: The official GitHub page for the survey paper "A Survey on Large Language... | | Emerging |
| 2095 | fajri91/sum_liputan6: The first large-scale summarization corpus for the Indonesian language. AACL 2020. | | Emerging |
| 2096 | cmu-flame/FLAME-MoE: Official repository for FLAME-MoE: A Transparent End-to-End Research... | | Emerging |
| 2097 | ymoslem/Adaptive-MT-LLM: Adaptive Machine Translation with Large Language Models | | Emerging |
| 2098 | HxCodeWarrior/StellarByte: Implement a basic Transformer decoder-only model from scratch, then upgrade the model to build your own LLM | | Emerging |
| 2099 | TIGER-AI-Lab/TIGERScore: "TIGERScore: Towards Building Explainable Metric for All Text Generation... | | Emerging |
| 2100 | xmartlabs/spoter-embeddings: Create embeddings from sign pose videos using Transformers | | Emerging |