pbloem/former
Simple transformer implementation from scratch in PyTorch. (Archival; the latest version lives on Codeberg.)
Built entirely in PyTorch without external NLP libraries, this implementation covers the complete transformer stack, including multi-head attention, positional encoding, and feed-forward layers, with detailed comments explaining each component. The codebase prioritizes educational clarity over performance, making it useful for understanding transformer mechanics rather than for production deployment.
1,092 stars. No commits in the last 6 months.
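To illustrate the kind of component the repository walks through, here is a minimal multi-head self-attention module in plain PyTorch. This is a hypothetical sketch in the spirit of an educational from-scratch implementation, not code taken from pbloem/former; the class and parameter names (`SelfAttention`, `k`, `heads`) are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    """Minimal multi-head self-attention sketch (illustrative, not the repo's code)."""

    def __init__(self, k, heads=4):
        super().__init__()
        assert k % heads == 0, "embedding dim must divide evenly over heads"
        self.k, self.heads = k, heads
        # Separate linear maps produce queries, keys, and values from the input.
        self.toqueries = nn.Linear(k, k, bias=False)
        self.tokeys = nn.Linear(k, k, bias=False)
        self.tovalues = nn.Linear(k, k, bias=False)
        # Final projection unifies the concatenated heads back to dimension k.
        self.unify = nn.Linear(k, k)

    def forward(self, x):
        b, t, k = x.size()          # batch, sequence length, embedding dim
        h, s = self.heads, k // self.heads

        # Split the embedding dimension into h heads of size s.
        queries = self.toqueries(x).view(b, t, h, s).transpose(1, 2)  # (b, h, t, s)
        keys    = self.tokeys(x).view(b, t, h, s).transpose(1, 2)
        values  = self.tovalues(x).view(b, t, h, s).transpose(1, 2)

        # Scaled dot-product attention per head.
        dot = torch.matmul(queries, keys.transpose(-2, -1)) / (s ** 0.5)  # (b, h, t, t)
        attn = F.softmax(dot, dim=-1)

        # Weighted sum of values, then merge heads and project.
        out = torch.matmul(attn, values).transpose(1, 2).reshape(b, t, k)
        return self.unify(out)
```

A full transformer block would wrap this in residual connections, layer normalization, and a feed-forward sublayer, which is exactly the kind of step-by-step assembly the repository's comments are aimed at.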
Stars: 1,092
Forks: 172
Language: Python
License: MIT
Category:
Last pushed: Mar 20, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/pbloem/former"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
huggingface/transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in...
kyegomez/LongNet
Implementation of plug in and play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
ARM-software/keyword-transformer
Official implementation of the Keyword Transformer: https://arxiv.org/abs/2104.00769
IBM/regression-transformer
Regression Transformer (2023; Nature Machine Intelligence)