All Transformer Models

6,429 models ranked by quality score

Showing 901–1000 of 6,429
# Model Score Tier
901 jianghoucheng/AnyEdit

AnyEdit: Edit Any Knowledge Encoded in Language Models, ICML 2025

43
Emerging
902 huangwl18/language-planner

Official Code for "Language Models as Zero-Shot Planners: Extracting...

43
Emerging
903 Intelligent-CAT-Lab/PLTranslationEmpirical

Artifact repository for the paper "Lost in Translation: A Study of Bugs...

43
Emerging
904 elicit/machine-learning-list

A curriculum for learning about foundation models, from scratch to the frontier

43
Emerging
905 WangRongsheng/ChatGenTitle

🌟 ChatGenTitle: a paper-title generation model fine-tuned on LLaMA using information from one million arXiv papers

43
Emerging
906 MozerWang/AMPO

[ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents

43
Emerging
907 A-baoYang/alpaca-7b-chinese

Finetune LLaMA-7B with Chinese instruction datasets

42
Emerging
908 therealoliver/Deepdive-llama3-from-scratch

Implement llama3 inference step by step, grasp the core concepts, master...

42
Emerging
909 johnmai-dev/ChatMLX

🤖✨ChatMLX is a modern, open-source, high-performance chat application for...

42
Emerging
910 oxpig/CaLM

Protein language model trained on coding DNA

42
Emerging
911 Gunale0926/SORSA

SORSA: Singular Values and Orthonormal Regularized Singular Vectors...

42
Emerging
912 microsoft/interwhen

A framework for verifiable reasoning with language models.

42
Emerging
913 tosiyuki/LLaVA-JP

LLaVA-JP is a Japanese VLM trained with the LLaVA method

42
Emerging
914 amirfeder/CausaLM

CausaLM: Causal Model Explanation Through Counterfactual Language Models

42
Emerging
915 gaussalgo/adaptor

ACL 2022: Adaptor: a library to easily adapt a language model to your own...

42
Emerging
916 EleutherAI/DALLE-mtf

OpenAI's DALL-E for large-scale training in mesh-tensorflow.

42
Emerging
917 TextGeneratorio/text-generator.io

Run Vision LLMs, TTS and STT APIs. Website and API for https://text-generator.io

42
Emerging
918 sayakpaul/robustness-vit

Contains code for the paper "Vision Transformers are Robust Learners" (AAAI 2022).

42
Emerging
919 Alpha-VLLM/Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

42
Emerging
920 grctest/FastAPI-BitNet

Running Microsoft's BitNet inference framework via FastAPI, Uvicorn and Docker.

42
Emerging
921 ArdaGnsrn/ollama-php

This is a PHP library for Ollama. Ollama is an open-source project that...

42
Emerging
922 biodatlab/thonburian-whisper

Thonburian Whisper: Open models for fine-tuned Whisper in Thai. Try our demo...

42
Emerging
923 ZO-Bench/ZO-LLM

[ICML'24] Official code for the paper "Revisiting Zeroth-Order Optimization...

42
Emerging
924 AntixK/PyTorch-Model-Compare

Compare neural networks by their feature similarity

42
Emerging
925 Dartvauder/NeuroSandboxWebUI

(Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image,...

42
Emerging
926 Yachay-AI/byt5-geotagging

Confidence- and ByT5-based geotagging model predicting coordinates from text alone.

42
Emerging
927 CVxTz/music_genre_classification

Music genre classification: LSTM vs Transformer

42
Emerging
928 bilibili/Index-1.9B

A lightweight multilingual LLM

42
Emerging
929 ivanfioravanti/wine_variety_classification

Examples on how to use various LLM providers with a Wine Classification problem

42
Emerging
930 nova-land/gbnf-compiler

Plug n Play GBNF Compiler for llama.cpp

42
Emerging
931 CAMeL-Lab/CAMeLBERT

Code and models for "The Interplay of Variant, Size, and Task Type in Arabic...

42
Emerging
932 HamedBabaei/LLMs4OM

LLMs4OM: Matching Ontologies with Large Language Models

42
Emerging
933 hyperonym/basaran

Basaran is an open-source alternative to the OpenAI text completion API. It...

42
Emerging
934 sinanuozdemir/oreilly-ai-pipelines

Designing and Deploying LLM Pipelines

42
Emerging
935 softmax1/Flash-Attention-Softmax-N

CUDA and Triton implementations of Flash Attention with SoftmaxN.

42
Emerging
936 waikato-llm/llm-dataset-converter

For converting LLM datasets from one format into another.

42
Emerging
937 aimclub/FEDOT.LLM

LLM-based prototype for next-gen AutoML

42
Emerging
938 ZinYY/Online_RLHF

A PyTorch implementation of the paper "Provably Efficient Online RLHF with...

42
Emerging
939 bhavnicksm/vanilla-transformer-jax

JAX/Flax implementation of 'Attention Is All You Need' by Vaswani et al....

42
Emerging
940 josStorer/selfhostedAI

A collection of one-click self-hosted AI

42
Emerging
941 HyperCluster-Tech/manimator

Transform research papers and mathematical concepts into stunning visual...

42
Emerging
942 xyjigsaw/LLM-Pretrain-SFT

Scripts for LLM pre-training and fine-tuning (with/without LoRA, DeepSpeed)

42
Emerging
943 kyegomez/qformer

Implementation of Qformer from BLIP2 in Zeta Lego blocks.

42
Emerging
944 AmpereComputingAI/ampere_model_library

AML's goal is to make benchmarking of various AI architectures on Ampere...

42
Emerging
945 EncrEor/rlm-claude

Recursive Language Models for Claude Code - Infinite memory solution...

42
Emerging
946 shivendrra/SmallLanguageModel

An LLM cookbook, for building your own from scratch, all the way from...

42
Emerging
947 efeslab/fiddler

[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration

42
Emerging
948 thunlp/InfLLM

The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for...

42
Emerging
949 NVIDIA/Cosmos-Tokenizer

A suite of image and video neural tokenizers

42
Emerging
950 palewire/first-llm-classifier

Learn how journalists use large-language models to organize and analyze...

42
Emerging
951 ByteDance-Seed/FlexPrefill

Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse...

42
Emerging
952 monologg/GoEmotions-Korean

Korean version of GoEmotions Dataset 😍😢😱

42
Emerging
953 zjunlp/KnowledgeCircuits

[NeurIPS 2024] Knowledge Circuits in Pretrained Transformers

42
Emerging
954 uclaml/SPPO

The official implementation of Self-Play Preference Optimization (SPPO)

42
Emerging
955 LLukas22/llm-rs-python

Unofficial python bindings for the rust llm library. 🐍❤️🦀

42
Emerging
956 RobertCsordas/ndr

The official repository for our paper "The Neural Data Router: Adaptive...

42
Emerging
957 RobertCsordas/transformer_generalization

The official repository for our paper "The Devil is in the Detail: Simple...

42
Emerging
958 shushanxingzhe/transformers_ner

Add CRF or LSTM+CRF for huggingface transformers bert to perform better on...

42
Emerging
959 AviSoori1x/seemore

From scratch implementation of a vision language model in pure PyTorch

42
Emerging
960 calcuis/gguf-core

A simple way to interact with llama via gguf

42
Emerging
961 garyb9/twitter-llm-bot

Fully automatic asynchronous AI operated Twitter bot using Large Language...

42
Emerging
962 sedthh/BeatLearning

Open Source Generative AI Models for Automatic Rhythm Game Beatmap...

42
Emerging
963 nlp-uoregon/mlmm-evaluation

Multilingual Large Language Models Evaluation Benchmark

42
Emerging
964 canyuchen/ClinicalBench

Code for the paper "ClinicalBench: Can LLMs Beat Traditional ML Models in...

42
Emerging
965 golsun/DialogRPT

EMNLP 2020: "Dialogue Response Ranking Training with Large-Scale Human Feedback Data"

42
Emerging
966 ai-forever/mgpt

Multilingual Generative Pretrained Model

42
Emerging
967 asigalov61/SuperPiano

Absolutely amazing SOTA Google Colab (Jupyter) Notebooks for...

42
Emerging
968 monologg/DistilKoBERT

Distillation of KoBERT from SKTBrain (Lightweight KoBERT)

42
Emerging
969 jaisidhsingh/pytorch-mixtures

One-stop solutions for Mixture of Expert modules in PyTorch.

42
Emerging
970 lamalab-org/MatText

Text-based modeling of materials.

42
Emerging
971 princeton-nlp/LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via...

42
Emerging
972 Dicklesworthstone/llm_introspective_compression_and_metacognition

A novel approach for transformer model introspection that enables saving,...

42
Emerging
973 AbdelStark/attnres

Rust implementation of Attention Residuals from MoonshotAI/Kimi

42
Emerging
974 ChanithaAbey/AI-Agent-for-Stock-Prediction

An AI Agent for stock data analysis, news retrieval, and prediction; powered...

42
Emerging
975 sail-sg/understand-r1-zero

Understanding R1-Zero-Like Training: A Critical Perspective

42
Emerging
976 ayaka14732/llama-2-jax

JAX implementation of the Llama 2 model

42
Emerging
977 harleyszhang/lite_llama

A light llama-like llm inference framework based on the triton kernel.

42
Emerging
978 illiterate/BertClassifier

BERT Chinese text classification model implemented in PyTorch

42
Emerging
979 the-crypt-keeper/can-ai-code

Self-evaluating interview for AI coders

42
Emerging
980 westlake-repl/IDvs.MoRec

End-to-end Training for Multimodal Recommendation Systems

42
Emerging
981 lenguajenatural-ai/autotransformers

A Python package for automatically training and comparing language models.

42
Emerging
982 jingedawang/TutorialLLM

LLM Tutorial for Everyone.

42
Emerging
983 hellotransformers/Natural_Language_Processing_with_Transformers

Chinese translation of Natural Language Processing with Transformers, the most authoritative Transformers tutorial

42
Emerging
984 gotzmann/llama.go

llama.go is like llama.cpp in pure Golang!

42
Emerging
985 mojivalipour/symbolicgpt

Symbolic regression is the task of identifying a mathematical expression...

42
Emerging
986 njchoma/transformer_image_caption

Image Captioning based on Bottom-Up and Top-Down Attention model

42
Emerging
987 jankais3r/LLaMA_MPS

Run LLaMA (and Stanford-Alpaca) inference on Apple Silicon GPUs.

42
Emerging
988 leehanchung/lora-instruct

Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA

42
Emerging
989 menon92/BangalASR

Transformer-based Bangla Speech Recognition | Encoder-Decoder Architecture

42
Emerging
990 ssbuild/deep_training

deep learning

42
Emerging
991 gitctrlx/llama.go

Llama from scratch in Go.

42
Emerging
992 sinanuozdemir/foundations-of-gen-ai

Transformer Architectures for Generative AI

42
Emerging
993 ruanchaves/napolab

The Natural Portuguese Language Benchmark (Napolab). Stay up to date with...

42
Emerging
994 harleyszhang/llm_note

LLM notes, including model inference, transformer model structure, and llm...

42
Emerging
995 argosopentech/MetalTranslate

Customizable machine translation in C++

42
Emerging
996 mbzuai-oryx/Awesome-LLM-Post-training

Awesome Reasoning LLM Tutorial/Survey/Guide

42
Emerging
997 dohlee/chromoformer

The official code implementation for Chromoformer in PyTorch. (Lee et al.,...

42
Emerging
998 zetavg/LLaMA-LoRA-Tuner

UI tool for fine-tuning and testing your own LoRA models based on LLaMA,...

42
Emerging
999 NVlabs/GroupViT

Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges...

42
Emerging
1000 waltonfuture/Diabetica

[SCI-FM@ICLR 2025] Specialized LLMs capable of handling various diabetes tasks

42
Emerging