jaydeepthik/Nano-GPT
Simple GPT with multi-headed attention for char-level tokens, inspired by Andrej Karpathy's video lectures: https://github.com/karpathy/ng-video-lecture
This project is aimed at machine learning practitioners and researchers who want to understand the fundamental building blocks of generative pre-trained transformers (GPTs). You feed it text data and watch a simplified GPT learn to predict the next character, generating new text from the patterns it has picked up. It is well suited to anyone who wants to grasp the inner workings of models like ChatGPT at a foundational level.
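The core setup the description refers to can be illustrated in a few lines. The sketch below is not the repo's code: it shows char-level tokenization (mapping each character to an integer id and back) and the simplest possible next-character predictor, a bigram count table, which the actual project replaces with a transformer using multi-headed attention. All names here (`stoi`, `itos`, `predict_next`) are illustrative.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for whatever text file you train on.
text = "hello hello world"

# Char-level tokenization: one integer id per unique character.
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}   # char -> id
itos = {i: ch for i, ch in enumerate(vocab)}   # id -> char

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)

# Bigram counts: the simplest "predict the next character" model.
# A GPT learns a far richer version of this table via attention.
counts = defaultdict(Counter)
for a, b in zip(text, text[1:]):
    counts[a][b] += 1

def predict_next(ch):
    """Most frequent character that follows `ch` in the training text."""
    return counts[ch].most_common(1)[0][0]
```

A round trip like `decode(encode("hello"))` returns `"hello"`, and `predict_next("h")` yields `"e"` for this corpus; the notebook's transformer does the same job, but conditions on a whole context window rather than a single preceding character.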
No commits in the last 6 months.
Use this if you are a machine learning student or researcher seeking to learn and experiment with the core architecture of a GPT model through a simplified implementation.
Not ideal if you are looking for a production-ready, high-performance language model or a tool to perform advanced natural language processing tasks out-of-the-box.
Stars
7
Forks
—
Language
Jupyter Notebook
License
—
Category
—
Last pushed
May 10, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/jaydeepthik/Nano-GPT"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
tabularis-ai/be_great
A novel approach for synthesizing tabular data using pretrained large language models
EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron...
shibing624/textgen
TextGen: Implementation of Text Generation models, include LLaMA, BLOOM, GPT2, BART, T5, SongNet...
ai-forever/ru-gpts
Russian GPT3 models.
AdityaNG/kan-gpt
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold...