jaydeepthik/Nano-GPT
Simple GPT with multi-headed attention for char-level tokens, inspired by Andrej Karpathy's video lectures: https://github.com/karpathy/ng-video-lecture
This project is aimed at machine learning practitioners and researchers who want to understand the fundamental building blocks of generative pre-trained transformers (GPTs). You feed it text data and watch a simplified GPT learn to predict the next character, generating new text from the patterns it has picked up. It is well suited to anyone who wants to grasp the inner workings of models like ChatGPT at a foundational level.
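The core setup the description refers to can be illustrated in a few lines. The sketch below is not the repo's code: it shows char-level tokenization (mapping each character to an integer id and back) and the simplest possible next-character predictor, a bigram count table, which the actual project replaces with a transformer using multi-headed attention. All names here (`stoi`, `itos`, `predict_next`) are illustrative.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for whatever text file you train on.
text = "hello hello world"

# Char-level tokenization: one integer id per unique character.
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}   # char -> id
itos = {i: ch for i, ch in enumerate(vocab)}   # id -> char

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)

# Bigram counts: the simplest "predict the next character" model.
# A GPT learns a far richer version of this table via attention.
counts = defaultdict(Counter)
for a, b in zip(text, text[1:]):
    counts[a][b] += 1

def predict_next(ch):
    """Most frequent character that follows `ch` in the training text."""
    return counts[ch].most_common(1)[0][0]
```

A round trip like `decode(encode("hello"))` returns `"hello"`, and `predict_next("h")` yields `"e"` for this corpus; the notebook's transformer does the same job, but conditions on a whole context window rather than a single preceding character.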
No commits in the last 6 months.
Use this if you are a machine learning student or researcher seeking to learn and experiment with the core architecture of a GPT model through a simplified implementation.
Not ideal if you are looking for a production-ready, high-performance language model or a tool to perform advanced natural language processing tasks out-of-the-box.
Stars
7
Forks
—
Language
Jupyter Notebook
License
—
Category
—
Last pushed
May 10, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/jaydeepthik/Nano-GPT"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
tabularis-ai/be_great
A novel approach for synthesizing tabular data using pretrained large language models
EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron...
shibing624/textgen
TextGen: Implementation of Text Generation models, include LLaMA, BLOOM, GPT2, BART, T5, SongNet...
ai-forever/ru-gpts
Russian GPT3 models.
AdityaNG/kan-gpt
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold...