Adithya1209/slm-architecture-benchmarks

A comparative study of Linear, MLP, Attention, and Transformer architectures for character-level and word-level language modeling. It analyzes scaling laws, FLOPs-to-performance trade-offs, and generalization gaps on Tiny Shakespeare, PTB, and WikiText-2.

/ 100

Experimental

No License · No Package · No Dependents

Maintenance 10 / 25

Adoption 0 / 25

Maturity 1 / 25

Community 0 / 25

Stars —

Forks —

Language Python

License —

Category transformer-training-optimization

Last pushed Jan 15, 2026

Commits (30d)

Transformer Training Optimization · 42 models

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Adithya1209/slm-architecture-benchmarks"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
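
The same request can be made from Python. The following is a minimal sketch using only the standard library; it assumes the endpoint returns JSON, which the page does not confirm, and the helper names `build_url` and `fetch_quality` are hypothetical, introduced here for illustration. How an API key is supplied is not stated above, so the sketch covers only the keyless request.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def build_url(owner: str, repo: str) -> str:
    """Build the per-repository URL (hypothetical helper)."""
    # Percent-encode each path segment so unusual names stay valid.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for one repository (hypothetical helper).

    Assumption: the response body is JSON. If it is not,
    json.loads raises a ValueError.
    """
    request = urllib.request.Request(
        build_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("Adithya1209", "slm-architecture-benchmarks")` issues the same GET request as the curl command. Given the 100 requests/day keyless limit, caching the result locally is a reasonable choice when polling repeatedly.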

Higher-rated alternatives

openvinotoolkit/nncf

Neural Network Compression Framework for enhanced OpenVINO™ inference

huggingface/optimum

🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers...

NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

huggingface/optimum-intel

🤗 Optimum Intel: Accelerate inference with Intel optimization tools

eole-nlp/eole

Open language modeling toolkit based on PyTorch
