kyegomez/GPT3

An implementation of the base GPT-3 model architecture from the OpenAI paper "Language Models are Few-Shot Learners"

Emerging

This project offers a foundational implementation of the GPT-3 language model architecture for building language understanding and generation systems. It takes raw text, or numerical representations of text, as input and generates text conditioned on that context. Researchers and engineers working with large language models can use it to experiment with few-shot learning on a range of NLP tasks.
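To make the few-shot idea concrete, here is a minimal sketch of how a few-shot prompt can be assembled as plain text before it is handed to any autoregressive language model. The function name `build_few_shot_prompt` and the `Input:`/`Output:` layout are assumptions chosen for illustration; they are not part of this repository's API.

```python
def build_few_shot_prompt(task, examples, query):
    """Format a task description, solved examples, and one unsolved query.

    Hypothetical helper for illustration only; the "Input:"/"Output:"
    layout is an assumed convention, not something this project defines.
    """
    lines = [task, ""]
    for source, target in examples:
        # Each solved example shows the model the input/output pattern.
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final query is left open so the model continues after "Output:".
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "peppermint",
)
print(prompt)
```

The model is never fine-tuned in this setup: the solved examples sit in the context window, and the model is expected to continue the pattern for the final query.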

No commits in the last 6 months.

Use this if you are an AI researcher or machine learning engineer looking to understand, replicate, or build upon the core GPT-3 architecture for advanced natural language processing tasks.

Not ideal if you are a practitioner looking for a ready-to-use, pre-trained GPT-3 model for immediate application without deep technical setup or customization.

Natural Language Processing, Large Language Models, Few-Shot Learning, AI Research, Text Generation

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 12 / 25

Language: Python

License: MIT

Higher-rated alternatives

jingyaogong/minimind

🚀🚀 Train a 26M-parameter "large model" GPT completely from scratch in just 2 hours! 🌏

kyegomez/TeraGPT

Train a production-grade GPT in less than 400 lines of code. Better than Karpathy's version and GIGAGPT

theosorus/GPT2-Hasktorch

GPT-2 implementation in Haskell with the Hasktorch library, inspired by Andrej Karpathy's PyTorch...

noah-hein/mazeGPT

AI model for making mazes that extends OpenAI's GPT-2 model

RohitPawar001/GPT-2-Implementation

This repository contains an implementation of OpenAI's GPT-2 with LoRA, QLoRA, RLHF, PPO, GRPO,...
