All Transformer Models
6,429 models ranked by quality score · Page 15 of 65
| # | Model | Score | Tier |
|---|---|---|---|
| 1401 | JosefAlbers/VL-JEPA · VL-JEPA (Vision-Language Joint Embedding Predictive Architecture) in MLX | | Emerging |
| 1402 | zjukg/KoPA · [Paper][ACM MM 2024] Making Large Language Models Perform Better in... | | Emerging |
| 1403 | nlpaueb/greek-bert · A Greek edition of BERT pre-trained language model | | Emerging |
| 1404 | wenge-research/YAYI · YAYI LLM: secure, reliable custom large models for customers, trained on large-scale Chinese and English multi-domain instruction data, based on LLaMA 2 & BLOOM... | | Emerging |
| 1405 | pmichel31415/are-16-heads-really-better-than-1 · Code for the paper "Are Sixteen Heads Really Better than One?" | | Emerging |
| 1406 | ahmetkumass/yolo-gen · Train YOLO + VLM with one command. Auto-generate vision-language training... | | Emerging |
| 1407 | TrelisResearch/install-guides · Various installation guides for Large Language Models | | Emerging |
| 1408 | SqueezeAILab/LLM2LLM · [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement | | Emerging |
| 1409 | ccdv-ai/convert_checkpoint_to_lsg · Efficient Attention for Long Sequence Processing | | Emerging |
| 1410 | NisaarAgharia/Indian-LawyerGPT · Fine-Tuning Falcon-7B, LLAMA 2 with QLoRA to create an advanced AI model... | | Emerging |
| 1411 | sayakpaul/probing-vits · Probing the representations of Vision Transformers. | | Emerging |
| 1412 | taishan1994/LLM-Quantization · Notes summarizing LLM quantization. | | Emerging |
| 1413 | NiuTrans/LMT · Building an inclusive, scalable, and high-performance multilingual translation model | | Emerging |
| 1414 | chef-transformer/chef-transformer · Chef Transformer 🍲. | | Emerging |
| 1415 | EagleW/Scientific-Inspiration-Machines-Optimized-for-Novelty · Official implementation of the ACL 2024: Scientific Inspiration Machines... | | Emerging |
| 1416 | biswassanket/DocSegTr · A Bottom-Up Instance Segmentation Strategy for segmenting document instances... | | Emerging |
| 1417 | raimondilab/precogx · A predictor of GPCR couplings with G-proteins/B-arrs using Transformers | | Emerging |
| 1418 | BaiTheBest/SparseLLM · Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024) | | Emerging |
| 1419 | jaco-bro/MLX.zig · MLX.zig: Phi-4, Llama 3.2, and Whisper in Zig | | Emerging |
| 1420 | VITA-Group/LiGO · [ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer... | | Emerging |
| 1421 | sotiraslab/AgileFormer · This is the repo for the paper titled "AgileFormer: Spatially Agile... | | Emerging |
| 1422 | ashleykleynhans/text-generation-docker · Docker image for the Text Generation Web UI: A Gradio web UI for Large... | | Emerging |
| 1423 | ymoslem/Adaptive-MT-LLM-Fine-tuning · Fine-tuning Open-Source LLMs for Adaptive Machine Translation | | Emerging |
| 1424 | NVlabs/EoRA · [ICLRW'26] EoRA: Fine-tuning-free Compensation for Compressed LLM with... | | Emerging |
| 1425 | zhanshijinwat/Steel-LLM · Train a 1B LLM on 1T tokens from scratch as an individual | | Emerging |
| 1426 | cocktailpeanut/dalai · The simplest way to run LLaMA on your local machine | | Emerging |
| 1427 | ariannamethod/chuck.optimizer · Adam is blind. Chuck sees. Lee 4ever. | | Emerging |
| 1428 | rasbt/pytorch-memory-optim · This code repository contains the code used for my "Optimizing Memory Usage... | | Emerging |
| 1429 | sshh12/llm_optimize · LLM Optimize is a proof-of-concept library for doing LLM (large language... | | Emerging |
| 1430 | cmhungsteve/Awesome-Transformer-Attention · An ultimately comprehensive paper list of Vision Transformer/Attention,... | | Emerging |
| 1431 | lambdavi/SatDrive-SegFL · MLDL '23 Project: Federated Learning and Semantic Segmentation for... | | Emerging |
| 1432 | IAmPara0x/Yuno · Yuno is a context-based search engine for anime. | | Emerging |
| 1433 | icon-lab/BolT · Fused Window Transformers for fMRI Time Series Analysis... | | Emerging |
| 1434 | efeslab/Nanoflow · A throughput-oriented high-performance serving framework for LLMs | | Emerging |
| 1435 | AnkitNayak-eth/llmBench · llmBench is a high-depth benchmarking tool designed to measure the raw... | | Emerging |
| 1436 | WhereIsAI/BiLLM · Tool for converting LLMs from uni-directional to bi-directional by removing... | | Emerging |
| 1437 | argonne-lcf/LLM-Inference-Bench · LLM-Inference-Bench | | Emerging |
| 1438 | stylellm/stylellm_models · StyleLLM writing-style model: Text style transfer based on Large Language... | | Emerging |
| 1439 | Nkluge-correa/TeenyTinyLlama · A pair of tiny foundational models trained in Brazilian Portuguese.🦙🦙 | | Emerging |
| 1440 | otadk/nuxt-edge-ai · Nuxt module for local-first AI apps with server-side WASM inference via... | | Emerging |
| 1441 | yueyu1030/AttrPrompt · [NeurIPS 2023] This is the code for the paper `Large Language Model as... | | Emerging |
| 1442 | InnovatorLM/Innovator-VL · Fully Open-source Multimodal Language Models for Science Discovery | | Emerging |
| 1443 | bwittmann/transoar · A 3D medical Detection Transformer library. Papers accepted @ MIDL22 & MELBA23/02. | | Emerging |
| 1444 | ictnlp/BayLing · BayLing (百聆) is a LLaMA-based English/Chinese large language model with enhanced language alignment and strong English/Chinese capabilities; across multiple multilingual and general-task tests it achieves ChatGPT... | | Emerging |
| 1445 | Troyanovsky/llama-vision-image-tagger · Use Llama3.2 Vision for tagging and searching images on your local machine. | | Emerging |
| 1446 | voidism/DoLa · Official implementation for the paper "DoLa: Decoding by Contrasting Layers... | | Emerging |
| 1447 | mohammadtavakoli78/BEAM · [ICLR 2026] Beyond a Million Tokens: Benchmarking and Enhancing Long-Term... | | Emerging |
| 1448 | jellydn/gpt4all-cli · By utilizing GPT4All-CLI, developers can effortlessly tap into the power of... | | Emerging |
| 1449 | kyegomez/HSSS · Implementation of a Hierarchical Mamba as described in the paper:... | | Emerging |
| 1450 | luohongyin/LangCode · LangCode - Improving alignment and reasoning of large language models (LLMs)... | | Emerging |
| 1451 | adarshM84/OpenTalkGptCode · A Chrome extension hosts an Ollama UI web server on localhost and other... | | Emerging |
| 1452 | fangyuan-ksgk/Mini-LLaVA · A minimal implementation of LLaVA-style VLM with interleaved image & text &... | | Emerging |
| 1453 | ximinng/LLM4SVG · [CVPR 2025] Official implementation for "Empowering LLMs to Understand and... | | Emerging |
| 1454 | opendatalab/UrBench · [AAAI 2025] This repo contains evaluation code for the paper “UrBench: A... | | Emerging |
| 1455 | ccmdi/geobench · GeoGuessr benchmark for language models | | Emerging |
| 1456 | CoderLSF/fast-llama · Runs LLaMA with Extremely HIGH speed | | Emerging |
| 1457 | invergent-ai/surogate · Insanely fast LLM pre-training and fine-tuning for modern NVIDIA GPUs.... | | Emerging |
| 1458 | loong64/llama.cpp · LLM inference in C/C++ | | Emerging |
| 1459 | andrewkchan/yalm · Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O | | Emerging |
| 1460 | shinomakoi/AI-Messenger · A QT GUI for large language models | | Emerging |
| 1461 | osainz59/t5-encoder · An extension of the Transformers library that adds a T5ForSequenceClassification class. | | Emerging |
| 1462 | conceptofmind/t5-pytorch · Implementation of Exploring the Limits of Transfer Learning with a Unified... | | Emerging |
| 1463 | DeepLangAI/LingoWhale-8B · LingoWhale-8B: Open Bilingual LLMs / open-source bilingual pre-trained large models | | Emerging |
| 1464 | coderonion/awesome-llm-and-aigc · 🚀🚀🚀A collection of some awesome public projects about Large Language... | | Emerging |
| 1465 | nrimsky/LM-exp · LLM experiments done during SERI MATS - focusing on activation steering /... | | Emerging |
| 1466 | pier-maker92/bachsformer · A Bach music generator with Artificial Intelligence. This model is made by a... | | Emerging |
| 1467 | kurakurai/Luth · Luth is a state-of-the-art series of fine-tuned LLMs for French | | Emerging |
| 1468 | poloclub/Fine-tuning-LLMs · Finetune Llama 2 on Colab for free on your own data: step-by-step tutorial | | Emerging |
| 1469 | openmedlab/PULSE · PULSE: Pretrained and Unified Language Service Engine | | Emerging |
| 1470 | mkuchnik/relm · ReLM is a Regular Expression engine for Language Models | | Emerging |
| 1471 | snapllm/snapllm · 🔥 🔥 Alternative to Ollama 🔥 🔥 multi-model <1ms LLM switching | | Emerging |
| 1472 | kyegomez/MegaVIT · The open source implementation of the model from "Scaling Vision... | | Emerging |
| 1473 | ivonajdenkoska/tulip · [ICLR 2025] Official code repository for "TULIP: Token-length Upgraded CLIP" | | Emerging |
| 1474 | avnlp/llm-blender · LLM-Blender: Ensembling framework that maximizes LLM performance via... | | Emerging |
| 1475 | Eiztrips/ai-responder · A tool for creating and training models that imitate the communication style... | | Emerging |
| 1476 | tranquoctrinh/transformer · This is a PyTorch implementation of the Transformer model in the paper... | | Emerging |
| 1477 | justADeni/intel-npu-llm · A simple Python script for running LLMs on Intel's Neural Processing Units (NPUs) | | Emerging |
| 1478 | aws-samples/fine-tuning-llm-with-domain-knowledge · This repo walks you through how to use transfer learning to fine-tune an LLM... | | Emerging |
| 1479 | Bruce-Lee-LY/flash_attention_inference · Performance of the C++ interface of flash attention and flash attention v2... | | Emerging |
| 1480 | SeungyounShin/Llama2-Code-Interpreter · Make Llama2 use Code Execution, Debug, Save Code, Reuse it, Access to Internet | | Emerging |
| 1481 | adalkiran/llama-nuts-and-bolts · A holistic way of understanding how Llama and its components run in... | | Emerging |
| 1482 | WangJingyao07/Awesome-GRPO · Codebase of GRPO: Implementations and Resources of GRPO and Its Variants | | Emerging |
| 1483 | K024/llm-sharp · Language models in C# | | Emerging |
| 1484 | vs4vijay/AI-Playground · All-in-One AI Playground for LLM, Chat, RAG, Agents, etc. | | Emerging |
| 1485 | exasol/transformers-extension · An Exasol extension for using state-of-the-art pretrained machine learning... | | Emerging |
| 1486 | leap-laboratories/PIZZA · An attribution library for LLMs | | Emerging |
| 1487 | StarRing2022/ChatGPTX-Uni · Implements a cross-model technique combining multi-LoRA weight integration and switching with Zero-Finetune (zero fine-tuning) enhancement, LLM-Base+LLM-X+Alpaca; initially, LLM-Base is... | | Emerging |
| 1488 | teelinsan/camoscio · Camoscio: An Italian instruction-tuned language model based on LLaMA | | Emerging |
| 1489 | AlphaPav/mem-kk-logic · On Memorization of Large Language Models in Logical Reasoning | | Emerging |
| 1490 | TIGER-AI-Lab/StructLM · Code and data for "StructLM: Towards Building Generalist Models for... | | Emerging |
| 1491 | amazon-science/transformers-data-augmentation · Code associated with the "Data Augmentation using Pre-trained Transformer... | | Emerging |
| 1492 | JinjieNi/MixEval · The official evaluation suite and dynamic data release for MixEval. | | Emerging |
| 1493 | toriving/text-classification-transformers · Easy text classification for everyone : Bert based models via Huggingface... | | Emerging |
| 1494 | monologg/KoELECTRA-Pipeline · Transformers Pipeline with KoELECTRA | | Emerging |
| 1495 | TIGER-AI-Lab/VL-Rethinker · The official code of "VL-Rethinker: Incentivizing Self-Reflection of... | | Emerging |
| 1496 | virtualramblas/Domain-Specific-Small-Language-Models · Repository for the companion Colab notebook of the Domain-Specific Small... | | Emerging |
| 1497 | StupidTrees/SplitLLM · Split Learning Simulation Framework for LLMs | | Emerging |
| 1498 | WANGXinyiLinda/concept-based-demonstration-selection · Official code of the paper Large Language Models Are Implicitly Topic Models:... | | Emerging |
| 1499 | jrobine/twm · Transformer-based World Models | | Emerging |
| 1500 | michael-borck/study-buddy · Desktop AI tutoring app with local inference using Ollama for... | | Emerging |