pbloem/former
Simple transformer implementation from scratch in PyTorch. (Archival; the latest version lives on Codeberg.)
Built entirely in PyTorch without external NLP libraries, this implementation covers the complete transformer stack, including multi-head attention, positional encoding, and feed-forward layers, with detailed comments explaining each component. The codebase prioritizes educational clarity over performance, making it useful for understanding transformer mechanics rather than for production deployment.
1,092 stars. No commits in the last 6 months.
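To illustrate the kind of component the repository walks through, here is a minimal multi-head self-attention module in plain PyTorch. This is a hypothetical sketch in the spirit of an educational from-scratch implementation, not code taken from pbloem/former; the class and parameter names (`SelfAttention`, `k`, `heads`) are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    """Minimal multi-head self-attention sketch (illustrative, not the repo's code)."""

    def __init__(self, k, heads=4):
        super().__init__()
        assert k % heads == 0, "embedding dim must divide evenly over heads"
        self.k, self.heads = k, heads
        # Separate linear maps produce queries, keys, and values from the input.
        self.toqueries = nn.Linear(k, k, bias=False)
        self.tokeys = nn.Linear(k, k, bias=False)
        self.tovalues = nn.Linear(k, k, bias=False)
        # Final projection unifies the concatenated heads back to dimension k.
        self.unify = nn.Linear(k, k)

    def forward(self, x):
        b, t, k = x.size()          # batch, sequence length, embedding dim
        h, s = self.heads, k // self.heads

        # Split the embedding dimension into h heads of size s.
        queries = self.toqueries(x).view(b, t, h, s).transpose(1, 2)  # (b, h, t, s)
        keys    = self.tokeys(x).view(b, t, h, s).transpose(1, 2)
        values  = self.tovalues(x).view(b, t, h, s).transpose(1, 2)

        # Scaled dot-product attention per head.
        dot = torch.matmul(queries, keys.transpose(-2, -1)) / (s ** 0.5)  # (b, h, t, t)
        attn = F.softmax(dot, dim=-1)

        # Weighted sum of values, then merge heads and project.
        out = torch.matmul(attn, values).transpose(1, 2).reshape(b, t, k)
        return self.unify(out)
```

A full transformer block would wrap this in residual connections, layer normalization, and a feed-forward sublayer, which is exactly the kind of step-by-step assembly the repository's comments are aimed at.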
Stars: 1,092
Forks: 172
Language: Python
License: MIT
Category:
Last pushed: Mar 20, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/pbloem/former"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
huggingface/transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in...
kyegomez/LongNet
Implementation of plug in and play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
ARM-software/keyword-transformer
Official implementation of the Keyword Transformer: https://arxiv.org/abs/2104.00769
IBM/regression-transformer
Regression Transformer (2023; Nature Machine Intelligence)