All Transformer Models
6,429 models ranked by quality score · Page 16 of 65
| # | Model | Score | Tier |
|---|---|---|---|
| 1501 | AlexanderVNikitin/kernel-language-entropy · Code for Fine-grained Uncertainty Quantification for LLMs from Semantic... | | Emerging |
| 1502 | rbitr/llm.f90 · LLM inference in Fortran | | Emerging |
| 1503 | zjohn77/lightning-mlflow-hf · Use QLoRA to tune LLM in PyTorch-Lightning w/ Huggingface + MLflow | | Emerging |
| 1504 | xiangking/prompt_uie_torch · Medical named entity recognition with the extractive UIE open-sourced in PaddleNLP (torch implementation) | | Emerging |
| 1505 | ksm26/Finetuning-Large-Language-Models · Unlock the potential of finetuning Large Language Models (LLMs). Learn from... | | Emerging |
| 1506 | HomebrewML/HomebrewNLP-torch · A case study of efficient training of large language models using commodity hardware. | | Emerging |
| 1507 | litus-ai/classy · classy is a simple-to-use library for building high-performance Machine... | | Emerging |
| 1508 | mantasu/cs224n · Solutions for CS224n (2022) | | Emerging |
| 1509 | lliai/D2MoE · D^2-MoE: Delta Decompression for MoE-based LLMs Compression | | Emerging |
| 1510 | alexeykarnachev/full_stack_transformer · Pytorch library for end-to-end transformer models training, inference and serving | | Emerging |
| 1511 | openshieldai/openshield · OpenShield is a new generation security layer for AI models | | Emerging |
| 1512 | mohyunho/NAS_transformer · Evolutionary Neural Architecture Search on Transformers for RUL Prediction | | Emerging |
| 1513 | GithubX-F/DynaMO-RL · Dynamic Rollout Allocation and Advantage Modulation for Policy Optimization... | | Emerging |
| 1514 | GT-RIPL/robo-vln · Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics... | | Emerging |
| 1515 | taishi-i/nagisa_bert · A BERT model for nagisa | | Emerging |
| 1516 | poteminr/instruct-ner · Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models... | | Emerging |
| 1517 | canjiali/PARADE · code and data to facilitate BERT/ELECTRA for document ranking. Details refer... | | Emerging |
| 1518 | user1342/Tomato · LLM steganography with minimum-entropy coupling - Hiding encrypted messages... | | Emerging |
| 1519 | all-things-vits/code-samples · Holds code for our CVPR'23 tutorial: All Things ViTs: Understanding and... | | Emerging |
| 1520 | RLado/STB-VMM · STB-VMM: Swin Transformer Based Video Motion Magnification (official repository) | | Emerging |
| 1521 | jhcho99/CoFormer · [CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for... | | Emerging |
| 1522 | mu-cai/matryoshka-mm · Matryoshka Multimodal Models | | Emerging |
| 1523 | rkansal47/MPGAN · The message passing GAN https://arxiv.org/abs/2106.11535 and generative... | | Emerging |
| 1524 | Shanghai-Digital-Brain-Laboratory/BDM-DB1 · A large-scale multi-modal pre-trained model | | Emerging |
| 1525 | princeton-nlp/LLMBar · [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following | | Emerging |
| 1526 | zd11024/NaviLLM · [CVPR 2024] The code for paper 'Towards Learning a Generalist Model for... | | Emerging |
| 1527 | microsoft/AdaMix · This is the implementation of the paper AdaMix: Mixture-of-Adaptations for... | | Emerging |
| 1528 | eqimp/hogwild_llm · Official PyTorch implementation for Hogwild! Inference: Parallel LLM... | | Emerging |
| 1529 | zjunlp/Mol-Instructions · [ICLR 2024] Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset... | | Emerging |
| 1530 | CTCycle/ADSMOD-Adsorption-Modeling · Streamline adsorption modeling by automatically fitting theoretical... | | Emerging |
| 1531 | bodeby/torchstack · 🫧 probability-level model ensembling for transformers | | Emerging |
| 1532 | DebarshiChanda/Amazon-ML-Challenge2021 · Scripts and Approach for Amazon ML Challenge | | Emerging |
| 1533 | desaixie/zeroverse · Official code for NeurIPS 2024 paper LRM-Zero: Training Large Reconstruction... | | Emerging |
| 1534 | HKUNLP/icl-ceil · [ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”. | | Emerging |
| 1535 | K-H-Ismail/torchortho · [ICLR 2026] Polynomial, trigonometric, and tropical activations | | Emerging |
| 1536 | joslefaure/HERMES · [ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes... | | Emerging |
| 1537 | ImKeTT/AdaVAE · [Preprint] AdaVAE: Exploring Adaptive GPT-2s in VAEs for Language Modeling... | | Emerging |
| 1538 | babycommando/machinascript-for-robots · Build LLM-powered robots in your garage with MachinaScript For Robots! | | Emerging |
| 1539 | locuslab/massive-activations · Code accompanying the paper "Massive Activations in Large Language Models" | | Emerging |
| 1540 | eduard23144/locoformer · 🤖 Explore LocoFormer, a Transformer-XL model that enhances robot locomotion... | | Emerging |
| 1541 | ariya/gamal · Research tool leveraging LLM for answers | | Emerging |
| 1542 | lukechilds/humanscript · A truly natural scripting language | | Emerging |
| 1543 | SALT-NLP/LLaVAR · Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for... | | Emerging |
| 1544 | promptslab/LLMtuner · FineTune LLMs in few lines of code (Text2Text, Text2Speech, Speech2Text) | | Emerging |
| 1545 | horus-ai-labs/DistillFlow · Library for model distillation | | Emerging |
| 1546 | juyongjiang/CodeUp · CodeUp: A Multilingual Code Generation Llama-X Model with... | | Emerging |
| 1547 | extreme-bert/extreme-bert · ExtremeBERT is a toolkit that accelerates the pretraining of customized... | | Emerging |
| 1548 | sandesha21/Stock-Market-News-Sentiment-Analysis-and-Summarization · NLP pipeline for classifying sentiment in financial news and generating... | | Emerging |
| 1549 | OSU-NLP-Group/AmpleGCG · AmpleGCG: Learning a Universal and Transferable Generator of Adversarial... | | Emerging |
| 1550 | yangjianxin1/Firefly · Firefly:... | | Emerging |
| 1551 | viddexa/moderators · One package to moderate them all | | Emerging |
| 1552 | FuxiaoLiu/LRV-Instruction · [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust... | | Emerging |
| 1553 | volverjs/ai · Hugging Face Transformers.js wrapper for on-device AI with web-workers | | Emerging |
| 1554 | iiis-ai/cumulative-reasoning · [TMLR] Cumulative Reasoning With Large Language Models... | | Emerging |
| 1555 | CVI-SZU/Linly · Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pretraining/instruction-tuning datasets | | Emerging |
| 1556 | iMoonLab/LLM4Hypergraph · The source code of ICLR 2025 "Beyond Graphs: Can Large Language Models... | | Emerging |
| 1557 | tommyip/mamba2-minimal · Minimal Mamba-2 implementation in PyTorch | | Emerging |
| 1558 | ziegler-ingo/cleavage_benchmark · [BIBM 2025] Code and resources for the paper "Enhancing Multi-Epitope... | | Emerging |
| 1559 | hyintell/awesome-refreshing-llms · EMNLP'23 survey: a curation of awesome papers and resources on refreshing... | | Emerging |
| 1560 | SimeonHristov99/DL_25-26 · Practice sessions for the course "Introduction to deep learning" in the... | | Emerging |
| 1561 | huggingface/large_language_model_training_playbook · An open collection of implementation tips, tricks and resources for training... | | Emerging |
| 1562 | GyanPrakashkushwaha/DataScience · EVERYTHING YOU NEED FOR DATA SCIENCE. | | Emerging |
| 1563 | softengg-manoj/dreamer4 · 🌟 Implement Dreamer 4 for training agents within scalable world models,... | | Emerging |
| 1564 | NVlabs/RocketKV · [ICML 2025] RocketKV: Accelerating Long-Context LLM Inference via Two-Stage... | | Emerging |
| 1565 | ziplab/HVT · [ICCV 2021] Official implementation of "Scalable Vision Transformers with... | | Emerging |
| 1566 | oValach/RailSafeNet · Repository of the paper: RailSafeNet: Visual Scene Understanding for Tram Safety | | Emerging |
| 1567 | FSoft-AI4Code/CodeCapybara · Open-source Self-Instruction Tuning Code LLM | | Emerging |
| 1568 | jaketae/alibi · PyTorch implementation of Train Short, Test Long: Attention with Linear... | | Emerging |
| 1569 | AaronFeng753/Ollama-Model-Dumper · Export and Backup Ollama models into GGUF and ModelFile | | Emerging |
| 1570 | asigalov61/Perceiver-Music-Transformer · SOTA Google's Perceiver-AR Music Transformer Implementation and Model | | Emerging |
| 1571 | Alsace08/Chain-of-Embedding · [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding... | | Emerging |
| 1572 | kaistAI/Janus · [NeurIPS 2024] Train LLMs with diverse system messages reflecting... | | Emerging |
| 1573 | hao-ai-lab/DistCA · Efficient Long-context Language Model Training by Core Attention Disaggregation | | Emerging |
| 1574 | kyegomez/DifferentialTransformer · An open source community implementation of the model from "DIFFERENTIAL... | | Emerging |
| 1575 | DFKI-NLP/thermostat · Collection of NLP model explanations and accompanying analysis tools | | Emerging |
| 1576 | pleisto/yuren-baichuan-7b · Open-source multimodal large language model based on baichuan-7b | | Emerging |
| 1577 | warner-benjamin/commented-transformers · Highly commented implementations of Transformers in PyTorch | | Emerging |
| 1578 | sinanuozdemir/oreilly-bert-nlp · This repository contains code for the O'Reilly Live Online Training for BERT | | Emerging |
| 1579 | lifeadventurer/sentify · Leveraging Sentiment Analysis on News for Stock Market Insights | | Emerging |
| 1580 | AIFrameResearch/SPO · Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL... | | Emerging |
| 1581 | general-preference/general-preference-model · [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for... | | Emerging |
| 1582 | harryjdavies/HeartGPT · Interpretable Pre-Trained Transformers for Heart Time-Series Data | | Emerging |
| 1583 | qingsongedu/Awesome-TimeSeries-SpatioTemporal-LM-LLM · A professional list on Large (Language) Models and Foundation Models (LLM,... | | Emerging |
| 1584 | weiserlab/TinyLLM · Bringing Language Models to the Most Resource Constrained Devices | | Emerging |
| 1585 | styfeng/DataAug4NLP · Collection of papers and resources for data augmentation for NLP. | | Emerging |
| 1586 | zhilizju/Awesome-instruction-tuning · A curated list of awesome instruction tuning datasets, models, papers and... | | Emerging |
| 1587 | DAMO-NLP-SG/LLM-Zoo · LLM Zoo collects information of various open- and close-sourced LLMs | | Emerging |
| 1588 | aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow · This repository contains the implementation of paper Temporal Fusion... | | Emerging |
| 1589 | dravenk/ollama-zig · Ollama Zig library | | Emerging |
| 1590 | epfl-dlab/llm-latent-language · Repo accompanying our paper "Do Llamas Work in English? On the Latent... | | Emerging |
| 1591 | mrdbourke/mac-ml-speed-test · A few quick scripts focused on testing TensorFlow/PyTorch/Llama 2 on macOS. | | Emerging |
| 1592 | chanind/linear-relational · Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs)... | | Emerging |
| 1593 | csiro-robotics/HOTFormerLoc · [IEEE/CVF CVPR 2025] Hierarchical Octree Transformer for Versatile Lidar... | | Emerging |
| 1594 | mala-lab/SEMPO · [NeurIPS 2025] Official implementation of "SEMPO: Lightweight Foundation... | | Emerging |
| 1595 | ahans30/goldfish-loss · [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs | | Emerging |
| 1596 | mytechnotalent/RE-GPT · Inspired by Andrej Karpathy’s "Let’s Build GPT", this project guides you... | | Emerging |
| 1597 | tlc4418/llm_optimization · A repo for RLHF training and BoN over LLMs, with support for reward model ensembles. | | Emerging |
| 1598 | lechmazur/writing · This benchmark tests how well LLMs incorporate a set of 10 mandatory story... | | Emerging |
| 1599 | yinboc/trans-inr · Transformers as Meta-Learners for Implicit Neural Representations, in ECCV 2022 | | Emerging |
| 1600 | chaitjo/gated-graph-transformers · Transformers are Graph Neural Networks! | | Emerging |