Mahesh3394/training-of-transformer-on-dummy-data
Here we try to understand how the transformer works and replicate the architecture from the original published paper. We also train a simple version of the architecture on a dummy dataset.
No commits in the last 6 months.
Stars: 1
Forks: —
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Nov 24, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Mahesh3394/training-of-transformer-on-dummy-data"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000 requests/day.
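If you prefer calling the endpoint from code rather than curl, the request can be sketched in Python. Only the URL comes from the example above; the helper name, the path segments being interpreted as topic/owner/repo, and the response schema are assumptions, so treat this as a starting point, not documented API behavior.

```python
# Sketch: build the pt-edge quality-API URL for a given repository.
# The path layout (topic/owner/repo) is inferred from the curl example
# above; the JSON response fields are not documented on this page.

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(topic: str, owner: str, repo: str) -> str:
    """Construct the quality-API endpoint for a repository."""
    return f"{BASE}/{topic}/{owner}/{repo}"


if __name__ == "__main__":
    url = quality_url(
        "transformers", "Mahesh3394", "training-of-transformer-on-dummy-data"
    )
    print(url)
    # Fetch with any HTTP client (e.g. urllib.request.urlopen(url));
    # parsing is left to the caller since the schema is unspecified.
```

Within the free tier no authentication header is needed, so a plain GET on the constructed URL is enough.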
Higher-rated alternatives
huggingface/text-generation-inference
Large Language Model Text Generation Inference
OpenMachine-ai/transformer-tricks
A collection of tricks and tools to speed up transformer models
poloclub/transformer-explainer
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
IBM/TabFormer
Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
tensorgi/TPA
[NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6)...