All Transformer Models

6,429 models ranked by quality score · Page 15 of 65

Showing 1401–1500 of 6,429
# Model Score Tier
1401 JosefAlbers/VL-JEPA

VL-JEPA (Vision-Language Joint Embedding Predictive Architecture) in MLX

38
Emerging
1402 zjukg/KoPA

[Paper][ACM MM 2024] Making Large Language Models Perform Better in...

38
Emerging
1403 nlpaueb/greek-bert

A Greek edition of BERT pre-trained language model

38
Emerging
1404 wenge-research/YAYI

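YAYI large model: secure and reliable dedicated large models for customers, trained on large-scale Chinese and English multi-domain instruction data, LlaMA 2 & BLOOM...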

38
Emerging
1405 pmichel31415/are-16-heads-really-better-than-1

Code for the paper "Are Sixteen Heads Really Better than One?"

38
Emerging
1406 ahmetkumass/yolo-gen

Train YOLO + VLM with one command. Auto-generate vision-language training...

38
Emerging
1407 TrelisResearch/install-guides

Various installation guides for Large Language Models

38
Emerging
1408 SqueezeAILab/LLM2LLM

[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

38
Emerging
1409 ccdv-ai/convert_checkpoint_to_lsg

Efficient Attention for Long Sequence Processing

38
Emerging
1410 NisaarAgharia/Indian-LawyerGPT

Fine-Tuning Falcon-7B, LLAMA 2 with QLoRA to create an advanced AI model...

38
Emerging
1411 sayakpaul/probing-vits

Probing the representations of Vision Transformers.

38
Emerging
1412 taishan1994/LLM-Quantization

Summary notes on quantizing LLMs.

38
Emerging
1413 NiuTrans/LMT

Building an inclusive, scalable, and high-performance multilingual translation model

38
Emerging
1414 chef-transformer/chef-transformer

Chef Transformer 🍲 .

38
Emerging
1415 EagleW/Scientific-Inspiration-Machines-Optimized-for-Novelty

Official implementation of the ACL 2024: Scientific Inspiration Machines...

38
Emerging
1416 biswassanket/DocSegTr

A Bottom-Up Instance Segmentation Strategy for segmenting document instances...

38
Emerging
1417 raimondilab/precogx

A predictor of GPCR couplings with G-proteins/B-arrs using Transformers

38
Emerging
1418 BaiTheBest/SparseLLM

Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)

38
Emerging
1419 jaco-bro/MLX.zig

MLX.zig: Phi-4, Llama 3.2, and Whisper in Zig

38
Emerging
1420 VITA-Group/LiGO

[ICLR 2023] "Learning to Grow Pretrained Models for Efficient Transformer...

38
Emerging
1421 sotiraslab/AgileFormer

This is the repo for the paper titled "AgileFormer: Spatially Agile...

38
Emerging
1422 ashleykleynhans/text-generation-docker

Docker image for the Text Generation Web UI: A Gradio web UI for Large...

38
Emerging
1423 ymoslem/Adaptive-MT-LLM-Fine-tuning

Fine-tuning Open-Source LLMs for Adaptive Machine Translation

38
Emerging
1424 NVlabs/EoRA

[ICLRW'26] EoRA: Fine-tuning-free Compensation for Compressed LLM with...

38
Emerging
1425 zhanshijinwat/Steel-LLM

Train a 1B LLM on 1T tokens from scratch as an individual

38
Emerging
1426 cocktailpeanut/dalai

The simplest way to run LLaMA on your local machine

38
Emerging
1427 ariannamethod/chuck.optimizer

Adam is blind. Chuck sees. Lee 4ever.

38
Emerging
1428 rasbt/pytorch-memory-optim

This code repository contains the code used for my "Optimizing Memory Usage...

38
Emerging
1429 sshh12/llm_optimize

LLM Optimize is a proof-of-concept library for doing LLM (large language...

38
Emerging
1430 cmhungsteve/Awesome-Transformer-Attention

An ultimately comprehensive paper list of Vision Transformer/Attention,...

38
Emerging
1431 lambdavi/SatDrive-SegFL

MLDL '23 Project: Federated Learning and Semantic Segmentation for...

38
Emerging
1432 IAmPara0x/Yuno

Yuno is a context-based search engine for anime.

38
Emerging
1433 icon-lab/BolT

Fused Window Transformers for fMRI Time Series Analysis...

38
Emerging
1434 efeslab/Nanoflow

A throughput-oriented high-performance serving framework for LLMs

38
Emerging
1435 AnkitNayak-eth/llmBench

llmBench is a high-depth benchmarking tool designed to measure the raw...

38
Emerging
1436 WhereIsAI/BiLLM

Tool for converting LLMs from uni-directional to bi-directional by removing...

38
Emerging
1437 argonne-lcf/LLM-Inference-Bench

LLM-Inference-Bench

38
Emerging
1438 stylellm/stylellm_models

StyleLLM writing-style large model: a text style transfer project based on large language models. Text style transfer based on Large Language...

38
Emerging
1439 Nkluge-correa/TeenyTinyLlama

A pair of tiny foundational models trained in Brazilian Portuguese.🦙🦙

38
Emerging
1440 otadk/nuxt-edge-ai

Nuxt module for local-first AI apps with server-side WASM inference via...

38
Emerging
1441 yueyu1030/AttrPrompt

[NeurIPS 2023] This is the code for the paper `Large Language Model as...

38
Emerging
1442 InnovatorLM/Innovator-VL

Fully Open-source Multimodal Language Models for Science Discovery

38
Emerging
1443 bwittmann/transoar

A 3D medical Detection Transformer library. Papers accepted @ MIDL22 & MELBA23/02.

38
Emerging
1444 ictnlp/BayLing

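"BayLing" is a LLaMA-based English/Chinese large language model with enhanced language alignment and strong English/Chinese capabilities; across multilingual and general-task evaluations it achieves ChatGPT...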

38
Emerging
1445 Troyanovsky/llama-vision-image-tagger

Use Llama3.2 Vision for tagging and searching images on your local machine.

37
Emerging
1446 voidism/DoLa

Official implementation for the paper "DoLa: Decoding by Contrasting Layers...

37
Emerging
1447 mohammadtavakoli78/BEAM

[ICLR 2026] Beyond a Million Tokens: Benchmarking and Enhancing Long-Term...

37
Emerging
1448 jellydn/gpt4all-cli

By utilizing GPT4All-CLI, developers can effortlessly tap into the power of...

37
Emerging
1449 kyegomez/HSSS

Implementation of a Hierarchical Mamba as described in the paper:...

37
Emerging
1450 luohongyin/LangCode

LangCode - Improving alignment and reasoning of large language models (LLMs)...

37
Emerging
1451 adarshM84/OpenTalkGptCode

A Chrome extension that hosts an Ollama UI web server on localhost and other...

37
Emerging
1452 fangyuan-ksgk/Mini-LLaVA

A minimal implementation of LLaVA-style VLM with interleaved image & text &...

37
Emerging
1453 ximinng/LLM4SVG

[CVPR 2025] Official implementation for "Empowering LLMs to Understand and...

37
Emerging
1454 opendatalab/UrBench

[AAAI 2025] This repo contains evaluation code for the paper “UrBench: A...

37
Emerging
1455 ccmdi/geobench

GeoGuessr benchmark for language models

37
Emerging
1456 CoderLSF/fast-llama

Runs LLaMA with Extremely HIGH speed

37
Emerging
1457 invergent-ai/surogate

Insanely fast LLM pre-training and fine-tuning for modern NVIDIA GPUs....

37
Emerging
1458 loong64/llama.cpp

LLM inference in C/C++

37
Emerging
1459 andrewkchan/yalm

Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O

37
Emerging
1460 shinomakoi/AI-Messenger

A QT GUI for large language models

37
Emerging
1461 osainz59/t5-encoder

An extension of the Transformers library that adds a T5ForSequenceClassification class.

37
Emerging
1462 conceptofmind/t5-pytorch

Implementation of Exploring the Limits of Transfer Learning with a Unified...

37
Emerging
1463 DeepLangAI/LingoWhale-8B

LingoWhale-8B: Open Bilingual LLMs | Open-source bilingual pre-trained large model

37
Emerging
1464 coderonion/awesome-llm-and-aigc

🚀🚀🚀A collection of some awesome public projects about Large Language...

37
Emerging
1465 nrimsky/LM-exp

LLM experiments done during SERI MATS - focusing on activation steering /...

37
Emerging
1466 pier-maker92/bachsformer

A Bach music generator with Artificial Intelligence. This model is made by a...

37
Emerging
1467 kurakurai/Luth

Luth is a state-of-the-art series of fine-tuned LLMs for French

37
Emerging
1468 poloclub/Fine-tuning-LLMs

Finetune Llama 2 on Colab for free on your own data: step-by-step tutorial

37
Emerging
1469 openmedlab/PULSE

PULSE: Pretrained and Unified Language Service Engine

37
Emerging
1470 mkuchnik/relm

ReLM is a Regular Expression engine for Language Models

37
Emerging
1471 snapllm/snapllm

🔥 🔥 Alternative to Ollama 🔥 🔥 multi-model <1ms LLM switching

37
Emerging
1472 kyegomez/MegaVIT

The open source implementation of the model from "Scaling Vision...

37
Emerging
1473 ivonajdenkoska/tulip

[ICLR 2025] Official code repository for "TULIP: Token-length Upgraded CLIP"

37
Emerging
1474 avnlp/llm-blender

LLM-Blender: Ensembling framework that maximizes LLM performance via...

37
Emerging
1475 Eiztrips/ai-responder

A tool for creating and training models that imitate the communication style...

37
Emerging
1476 tranquoctrinh/transformer

This is a PyTorch implementation of the Transformer model in the paper...

37
Emerging
1477 justADeni/intel-npu-llm

A simple Python script for running LLMs on Intel's Neural Processing Units (NPUs)

37
Emerging
1478 aws-samples/fine-tuning-llm-with-domain-knowledge

This repo walks you through how to use transfer learning to fine-tune an LLM...

37
Emerging
1479 Bruce-Lee-LY/flash_attention_inference

Performance of the C++ interface of flash attention and flash attention v2...

37
Emerging
1480 SeungyounShin/Llama2-Code-Interpreter

Make Llama2 use Code Execution, Debug, Save Code, Reuse it, Access to Internet

37
Emerging
1481 adalkiran/llama-nuts-and-bolts

A holistic way of understanding how Llama and its components run in...

37
Emerging
1482 WangJingyao07/Awesome-GRPO

Codebase of GRPO: Implementations and Resources of GRPO and Its Variants

37
Emerging
1483 K024/llm-sharp

Language models in C#

37
Emerging
1484 vs4vijay/AI-Playground

All-in-One AI Playground for LLM, Chat, RAG, Agents, etc.

37
Emerging
1485 exasol/transformers-extension

An Exasol extension for using state-of-the-art pretrained machine learning...

37
Emerging
1486 leap-laboratories/PIZZA

An attribution library for LLMs

37
Emerging
1487 StarRing2022/ChatGPTX-Uni

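Implements a cross-model technical approach combining multi-LoRA weight integration and switching with Zero-Finetune enhancement: LLM-Base + LLM-X + Alpaca. Initially, LLM-Base is...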

37
Emerging
1488 teelinsan/camoscio

Camoscio: An Italian instruction-tuned language model based on LLaMA

37
Emerging
1489 AlphaPav/mem-kk-logic

On Memorization of Large Language Models in Logical Reasoning

37
Emerging
1490 TIGER-AI-Lab/StructLM

Code and data for "StructLM: Towards Building Generalist Models for...

37
Emerging
1491 amazon-science/transformers-data-augmentation

Code associated with the "Data Augmentation using Pre-trained Transformer...

37
Emerging
1492 JinjieNi/MixEval

The official evaluation suite and dynamic data release for MixEval.

37
Emerging
1493 toriving/text-classification-transformers

Easy text classification for everyone: BERT-based models via Huggingface...

37
Emerging
1494 monologg/KoELECTRA-Pipeline

Transformers Pipeline with KoELECTRA

37
Emerging
1495 TIGER-AI-Lab/VL-Rethinker

The official code of "VL-Rethinker: Incentivizing Self-Reflection of...

37
Emerging
1496 virtualramblas/Domain-Specific-Small-Language-Models

Repository for the companion Colab notebook of the Domain-Specific Small...

37
Emerging
1497 StupidTrees/SplitLLM

Split Learning Simulation Framework for LLMs

37
Emerging
1498 WANGXinyiLinda/concept-based-demonstration-selection

Official code of the paper Large Language Models Are Implicitly Topic Models:...

37
Emerging
1499 jrobine/twm

Transformer-based World Models

37
Emerging
1500 michael-borck/study-buddy

Desktop AI tutoring app with local inference using Ollama for...

37
Emerging