Transformer Training Optimization Models
Tools, frameworks, and techniques for accelerating transformer model training and inference through hardware-specific optimizations, parallelism strategies, and performance tuning. Does NOT include model compression/pruning, application-specific fine-tuning, or inference deployment platforms.
42 transformer training optimization projects are tracked. Three score above 70 (the Verified tier). The highest-rated is huggingface/optimum at 90/100, with 3,325 stars and 1,613,657 monthly downloads. Four of the top 10 are actively maintained.
Get all 42 projects as JSON (set `limit` above the tracked count so a single request returns the full set):
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=transformer-training-optimization&limit=50"
Open to everyone: 100 requests/day with no key needed. A free key raises that to 1,000/day.
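For scripted access, here is a minimal Python sketch against the same endpoint. The response schema (a JSON list of project records with a `tier` field) and the `X-API-Key` header name are assumptions, not documented behavior; inspect one response before relying on them.

```python
# Minimal sketch: fetch the transformer-training-optimization dataset as JSON.
# The record fields ("tier") and the API-key header name are guesses; inspect
# one real response and adjust before relying on them.
from collections import Counter

import requests

API_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"
params = {
    "domain": "transformers",
    "subcategory": "transformer-training-optimization",
    "limit": 50,  # above 42, so one request covers every tracked project
}
headers = {}
# Hypothetical header for a free API key (1,000 requests/day); the real
# header name may differ -- check the service's documentation.
# headers["X-API-Key"] = "your-key-here"

resp = requests.get(API_URL, params=params, headers=headers, timeout=30)
resp.raise_for_status()
data = resp.json()

# Unwrap a nested payload if the list is not top-level (assumption).
projects = data if isinstance(data, list) else data.get("projects", [])

# Count projects per quality tier, assuming each record has a "tier" field.
print(Counter(p.get("tier", "unknown") for p in projects))
```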
| # | Model | Description | Tier |
|---|---|---|---|
| 1 | huggingface/optimum | 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and... | Verified |
| 2 | openvinotoolkit/nncf | Neural Network Compression Framework for enhanced OpenVINO™ inference | Verified |
| 3 | NVIDIA/Megatron-LM | Ongoing research training transformer models at scale | Verified |
| 4 | huggingface/optimum-intel | 🤗 Optimum Intel: Accelerate inference with Intel optimization tools | Established |
| 5 | RBLN-SW/optimum-rbln | ⚡ A seamless integration of HuggingFace Transformers & Diffusers with RBLN... | Established |
| 6 | eole-nlp/eole | Open language modeling toolkit based on PyTorch | Established |
| 7 | huggingface/optimum-habana | Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU) | Established |
| 8 | microsoft/mup | maximal update parametrization (µP) | Established |
| 9 | olivkoch/nano-trm | An implementation of Tiny Recursive Models (TRM) | Emerging |
| 10 | NVIDIA-AI-IOT/nanoowl | A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT. | Emerging |
| 11 | AlekseyKorshuk/optimum-transformers | Accelerated NLP pipelines for fast inference on CPU and GPU. Built with... | Emerging |
| 12 | patil-suraj/onnx_transformers | Accelerated NLP pipelines for fast inference on CPU. Built with Transformers... | Emerging |
| 13 | huggingface/optimum-graphcore | Blazing fast training of 🤗 Transformers on Graphcore IPUs | Emerging |
| 14 | LowinLi/fastgpt | ⚡ boost inference speed of GPT models in transformers by onnxruntime | Emerging |
| 15 | xrsrke/pipegoose | Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of... | Emerging |
| 16 | Jagatmohan46/tiny-recursive-model | 🚀 Implement the Tiny Recursive Model (TRM) for improved performance in... | Emerging |
| 17 | ParCIS/Chimera | Chimera: bidirectional pipeline parallelism for efficiently training... | Emerging |
| 18 | teelinsan/parallel-decoding | Repository of the paper "Accelerating Transformer Inference for Translation... | Experimental |
| 19 | Naman-ntc/FastCode | Utilities for efficient fine-tuning, inference and evaluation of code... | Experimental |
| 20 | rasbt/faster-pytorch-blog | Outlining techniques for improving the training performance of your PyTorch... | Experimental |
| 21 | alex-snd/TRecover | 📜 A python library for distributed training of a Transformer neural network... | Experimental |
| 22 | jshuadvd/LongRoPE | Implementation of the LongRoPE: Extending LLM Context Window Beyond 2... | Experimental |
| 23 | sandyresearch/chipmunk | 🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E... | Experimental |
| 24 | XingLuxi/Cal-FLOPs-for-PLM | Calculating FLOPs of Pre-trained Models in NLP | Experimental |
| 25 | 14062/Megatron-LM | Enable large-scale transformer model training with GPU-optimized tools and... | Experimental |
| 26 | dkurt/optimum-openvino | Intel OpenVINO extension for Hugging Face Transformers | Experimental |
| 27 | NachoPeinador/FRUGAL_AI_CHIP | FrugalAI Chip: Modular silicon architecture for disposable AI. Achieves... | Experimental |
| 28 | dzungphieuluuky/OuroTrace | Benchmark and evaluation of the ByteDance Ouro model based on Looped Language... | Experimental |
| 29 | dino65-dev/REPO-Attention | RePo: Language Models with Context Re-Positioning by Sakana AI | Experimental |
| 30 | KimDaeUng/PLM-Implementation | NLP Pretrained Language Models Implementation Study | Experimental |
| 31 | korovod/kenotron | Experimental fork of Nanotron, a minimalistic large language model... | Experimental |
| 32 | kyegomez/VO-ROPE | An implementation of the all-new rope from jianlin | Experimental |
| 33 | christinakim/scaling-laws-for-language-transfer | Code for Scaling Laws for Language Transfer Learning | Experimental |
| 34 | stoyan-stoyanov/transformers-calculator | Transformer Calculator - Estimate training time for transformer models. | Experimental |
| 35 | luozichen/NeonBench | A systematic study of ultra-tiny language models | Experimental |
| 36 | supersjgk/Transformers | Playing with Transformers and LLM | Experimental |
| 37 | mtszkw/fast-torch | Comparing PyTorch, JIT and ONNX for inference with Transformers | Experimental |
| 38 | Adithya1209/slm-architecture-benchmarks | Comparative study of Linear, MLP, Attention, and Transformer architectures... | Experimental |
| 39 | sakhileln/rope-pytorch | RoPE Playground – Rotary Positional Embeddings in PyTorch | Experimental |
| 40 | elvinagam/benchmarking_gpu_inference | Scripts for neural network inference on PyTorch with tools like ONNX,... | Experimental |
| 41 | MarkusSagen/Transformers-LM-Benchmark | Benchmark training and inference time for Transformer models on Huggingface | Experimental |
| 42 | rpatrik96/llm-non-identifiability | Investigating the non-identifiability of Transformers | Experimental |