vraun0/Transformer
Implementation of the paper "Attention Is All You Need" (2017) in PyTorch, with the multi-head attention layers and the encoder and decoder blocks implemented from scratch. The model was trained on the IWSLT 2017 dataset for English-to-Italian translation.
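For reference, here is a minimal sketch of the multi-head attention mechanism described in the paper, written against stock PyTorch. This is not the code from vraun0/Transformer; the class name, default dimensions, and masking convention below are illustrative assumptions.

# Minimal sketch of multi-head scaled dot-product attention, as described in
# "Attention Is All You Need". Not the repo's actual implementation; names and
# defaults are assumptions for illustration.
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int = 512, num_heads: int = 8):
        super().__init__()
        assert d_model % num_heads == 0
        self.d_k = d_model // num_heads
        self.num_heads = num_heads
        # Learned projections for queries, keys, values, and the output.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, query, key, value, mask=None):
        batch_size = query.size(0)

        def split_heads(x, proj):
            # (batch, seq, d_model) -> (batch, heads, seq, d_k)
            return proj(x).view(batch_size, -1, self.num_heads, self.d_k).transpose(1, 2)

        q = split_heads(query, self.w_q)
        k = split_heads(key, self.w_k)
        v = split_heads(value, self.w_v)

        # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
        scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(self.d_k)
        if mask is not None:
            # Positions where mask == 0 are excluded from attention.
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        out = torch.matmul(attn, v)

        # Re-assemble heads: (batch, heads, seq, d_k) -> (batch, seq, d_model)
        out = out.transpose(1, 2).contiguous().view(batch_size, -1, self.num_heads * self.d_k)
        return self.w_o(out)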
No commits in the last 6 months.
Stars: 2
Forks: —
Language: Python
License: —
Category: —
Last pushed: Aug 12, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/vraun0/Transformer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
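The same record can also be fetched programmatically. Below is a minimal Python sketch using only the standard library; the endpoint URL is taken from the curl command above, but the structure of the JSON response is not documented here, so the example simply prints whatever fields come back.

# Fetch the repo-quality record from the pt-edge API (no key needed at the free tier).
# The response schema is an assumption, so the result is just pretty-printed.
import json
import urllib.request

URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/vraun0/Transformer"

with urllib.request.urlopen(URL) as resp:
    data = json.load(resp)

print(json.dumps(data, indent=2))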
Higher-rated alternatives
lucidrains/x-transformers
A concise but complete full-attention transformer with a set of promising experimental features...
kanishkamisra/minicons
Utility for behavioral and representational analyses of Language Models
lucidrains/simple-hierarchical-transformer
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
lucidrains/dreamer4
Implementation of Danijar's latest iteration for his Dreamer line of work
Nicolepcx/Transformers-in-Action
This is the corresponding code for the book Transformers in Action