Transformer Training Optimization

Tools, frameworks, and techniques for accelerating transformer model training and inference through hardware-specific optimizations, parallelism strategies, and performance tuning. Does NOT include model compression/pruning, application-specific fine-tuning, or inference deployment platforms.

There are 42 transformer training optimization projects tracked. 3 score above 70 (verified tier). The highest-rated is huggingface/optimum at 90/100, with 3,325 stars and 1,613,657 monthly downloads. 4 of the top 10 are actively maintained.

Get all 42 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=transformer-training-optimization&limit=42"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
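The same query can be issued from Python. A minimal sketch is below: it assembles the endpoint URL and filters a parsed response down to verified-tier projects (score above 70). The response field names (`name`, `score`) are assumptions about the JSON shape, and the sample payload only mirrors a few rows from the table; a real call would fetch the URL instead.

```python
import json
from urllib.parse import urlencode

BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_url(domain: str, subcategory: str, limit: int = 50) -> str:
    # Assemble the query string for the quality endpoint.
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE}?{query}"

def verified(projects: list[dict], threshold: int = 70) -> list[str]:
    # Keep only entries above the verified-tier cutoff.
    # 'name' and 'score' are assumed field names, not confirmed API schema.
    return [p["name"] for p in projects if p["score"] > threshold]

# Sample payload mirroring a few rows of the table below; the live
# response shape may differ.
sample = json.loads(
    '[{"name": "huggingface/optimum", "score": 90},'
    ' {"name": "openvinotoolkit/nncf", "score": 86},'
    ' {"name": "huggingface/optimum-intel", "score": 64}]'
)

url = build_url("transformers", "transformer-training-optimization", limit=42)
print(url)
print(verified(sample))
```

In practice you would replace the hardcoded sample with `json.loads()` of the body returned by an HTTP GET on `url`.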

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | huggingface/optimum | 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and... | 90 | Verified |
| 2 | openvinotoolkit/nncf | Neural Network Compression Framework for enhanced OpenVINO™ inference | 86 | Verified |
| 3 | NVIDIA/Megatron-LM | Ongoing research training transformer models at scale | 76 | Verified |
| 4 | huggingface/optimum-intel | 🤗 Optimum Intel: Accelerate inference with Intel optimization tools | 64 | Established |
| 5 | RBLN-SW/optimum-rbln | ⚡ A seamless integration of HuggingFace Transformers & Diffusers with RBLN... | 61 | Established |
| 6 | eole-nlp/eole | Open language modeling toolkit based on PyTorch | 58 | Established |
| 7 | huggingface/optimum-habana | Easy and lightning-fast training of 🤗 Transformers on Habana Gaudi processors (HPU) | 57 | Established |
| 8 | microsoft/mup | Maximal update parametrization (µP) | 56 | Established |
| 9 | olivkoch/nano-trm | An implementation of Tiny Recursive Models (TRM) | 44 | Emerging |
| 10 | NVIDIA-AI-IOT/nanoowl | A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT | 41 | Emerging |
| 11 | AlekseyKorshuk/optimum-transformers | Accelerated NLP pipelines for fast inference on CPU and GPU. Built with... | 41 | Emerging |
| 12 | patil-suraj/onnx_transformers | Accelerated NLP pipelines for fast inference on CPU. Built with Transformers... | 39 | Emerging |
| 13 | huggingface/optimum-graphcore | Blazing fast training of 🤗 Transformers on Graphcore IPUs | 39 | Emerging |
| 14 | LowinLi/fastgpt | ⚡ Boost inference speed of GPT models in transformers with onnxruntime | 39 | Emerging |
| 15 | xrsrke/pipegoose | Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of... | 37 | Emerging |
| 16 | Jagatmohan46/tiny-recursive-model | 🚀 Implement the Tiny Recursive Model (TRM) for improved performance in... | 35 | Emerging |
| 17 | ParCIS/Chimera | Chimera: bidirectional pipeline parallelism for efficiently training... | 31 | Emerging |
| 18 | teelinsan/parallel-decoding | Repository of the paper "Accelerating Transformer Inference for Translation... | 28 | Experimental |
| 19 | Naman-ntc/FastCode | Utilities for efficient fine-tuning, inference and evaluation of code... | 26 | Experimental |
| 20 | rasbt/faster-pytorch-blog | Outlining techniques for improving the training performance of your PyTorch... | 25 | Experimental |
| 21 | alex-snd/TRecover | 📜 A Python library for distributed training of a Transformer neural network... | 25 | Experimental |
| 22 | jshuadvd/LongRoPE | Implementation of LongRoPE: Extending LLM Context Window Beyond 2... | 24 | Experimental |
| 23 | sandyresearch/chipmunk | 🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E... | 24 | Experimental |
| 24 | XingLuxi/Cal-FLOPs-for-PLM | Calculating FLOPs of pre-trained models in NLP | 22 | Experimental |
| 25 | 14062/Megatron-LM | Enable large-scale transformer model training with GPU-optimized tools and... | 22 | Experimental |
| 26 | dkurt/optimum-openvino | Intel OpenVINO extension for Hugging Face Transformers | 21 | Experimental |
| 27 | NachoPeinador/FRUGAL_AI_CHIP | FrugalAI Chip: modular silicon architecture for disposable AI. Achieves... | 20 | Experimental |
| 28 | dzungphieuluuky/OuroTrace | Benchmark and evaluation of the ByteDance Ouro model based on Looped Language... | 20 | Experimental |
| 29 | dino65-dev/REPO-Attention | RePo: Language Models with Context Re-Positioning, by Sakana AI | 19 | Experimental |
| 30 | KimDaeUng/PLM-Implementation | NLP pretrained language models implementation study | 18 | Experimental |
| 31 | korovod/kenotron | Experimental fork of Nanotron, a minimalistic large language model... | 17 | Experimental |
| 32 | kyegomez/VO-ROPE | An implementation of the all-new RoPE from Jianlin | 14 | Experimental |
| 33 | christinakim/scaling-laws-for-language-transfer | Code for Scaling Laws for Language Transfer Learning | 14 | Experimental |
| 34 | stoyan-stoyanov/transformers-calculator | Transformer Calculator: estimate training time for transformer models | 13 | Experimental |
| 35 | luozichen/NeonBench | A systematic study of ultra-tiny language models | 12 | Experimental |
| 36 | supersjgk/Transformers | Playing with Transformers and LLMs | 12 | Experimental |
| 37 | mtszkw/fast-torch | Comparing PyTorch, JIT and ONNX for inference with Transformers | 12 | Experimental |
| 38 | Adithya1209/slm-architecture-benchmarks | Comparative study of Linear, MLP, Attention, and Transformer architectures... | 11 | Experimental |
| 39 | sakhileln/rope-pytorch | RoPE Playground: rotary positional embeddings in PyTorch | 11 | Experimental |
| 40 | elvinagam/benchmarking_gpu_inference | Scripts for neural network inference on PyTorch with tools like ONNX,... | 10 | Experimental |
| 41 | MarkusSagen/Transformers-LM-Benchmark | Benchmark training and inference time for Transformer models on Huggingface | 10 | Experimental |
| 42 | rpatrik96/llm-non-identifiability | Investigating the non-identifiability of Transformers | 10 | Experimental |