All Transformer Models

6,429 models ranked by quality score · Page 25 of 65

Showing 2401–2500 of 6,429
# Model Score Tier
2401 sayakpaul/keras-xla-benchmarks

Presents comprehensive benchmarks of XLA-compatible pre-trained models in Keras.

29
Experimental
2402 eslambakr/LAR-Look-Around-and-Refer

This is the official implementation for our paper, "LAR: Look Around and Refer".

29
Experimental
2403 gabe00122/jaxrl

Partially Observable Multi-Agent RL with Transformers

29
Experimental
2404 tthinking/MATR

[IEEE TIP 2022] Official implementation of MATR: Multimodal Medical Image...

29
Experimental
2405 sauradip/STALE

[ECCV 2022] Official PyTorch implementation of the paper "Zero-Shot...

29
Experimental
2406 msakarvadia/memorization

Localizing Memorized Sequences in Language Models

29
Experimental
2407 WooooDyy/LLM-Reverse-Curriculum-RL

Implementation of the ICML 2024 paper "Training Large Language Models for...

29
Experimental
2408 zer0int/CLIP-fine-tune-registers-gated

Vision Transformers Need Registers. And Gated MLPs. And +20M params. Tiny...

29
Experimental
2409 abdur75648/V-Zen

V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel...

29
Experimental
2410 Victorwz/MLM_Filter

Official implementation of our paper "Finetuned Multimodal Language Models...

29
Experimental
2411 alphasecio/llama-guard

A web app for exploring content moderation with Llama Guard on Groq.

29
Experimental
2412 xvyaward/owq

Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization...

29
Experimental
2413 hristijanpeshov/SHAP-Explainable-Lexicon-Model

This project proposes a novel methodology to automatically learn financial...

29
Experimental
2414 nehalvaghasiya/interview-bot

AI-powered virtual interview bot to simulate real interview practice.

29
Experimental
2415 AshishGautamX/K8s-LLM-Scheduler

An intelligent Kubernetes scheduler powered by Meta's Llama-3.3-70B model...

29
Experimental
2416 moeru-ai/demodel

🚀🛸 Easily boost the speed of pulling your models and datasets from various...

29
Experimental
2417 gunnarnordqvist/opencode-context-filter

Transparent HTTP proxy that automatically filters repository context for...

29
Experimental
2418 markendo/downscaling_intelligence

Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in...

29
Experimental
2419 abhilashreddys/Fake-News-Article

Detecting fake news articles by analyzing patterns in writing.

29
Experimental
2420 AUCOHL/RTL-Repo

RTL-Repo: A Benchmark for Evaluating LLMs on Large-Scale RTL Design Projects...

29
Experimental
2421 allenai/staged-training

Staged Training for Transformer Language Models

29
Experimental
2422 dvgodoy/LLM-visuals

Over 60 figures and diagrams of LLMs, quantization, low-rank adapters...

29
Experimental
2423 jaketae/param-share-transformer

PyTorch implementation of Lessons on Parameter Sharing across Layers in Transformers

29
Experimental
2424 markusaksli/ai-music

A vanilla Transformer Decoder music generation model trained on Final Fantasy...

29
Experimental
2425 dpressel/mint

MinT: Minimal Transformer Library and Tutorials

29
Experimental
2426 llm-semantic-router/vllm-router

vLLM Router

29
Experimental
2427 zchuz/TimeBench

The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of...

29
Experimental
2428 Azure/nlp-samples

Japanese NLP sample code

29
Experimental
2429 ChartMimic/ChartMimic

[ICLR 2025] ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability...

29
Experimental
2430 WisconsinAIVision/YoLLaVA

🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant (NeurIPS 2024)

29
Experimental
2431 2toinf/IVM

[NeurIPS-2024] The official implementation of "Instruction-Guided Visual Masking"

29
Experimental
2432 dependentsign/Awesome-LLM-based-Evaluators

✨✨Latest Papers about LLM-based Evaluators

29
Experimental
2433 m-horky/sllm

Tools using small Large Language Models

29
Experimental
2434 dreamingjudith/KoGPT2-personachat

Fine-tuned KoGPT2 chatbot demo with translated PersonaChat (ongoing)

29
Experimental
2435 ariya/query-llm

Query LLM with Chain-of-Thought

29
Experimental
2436 vorobeevich/ml-snippets-classification

The source code of "Machine learning code snippets semantic classification"...

29
Experimental
2437 suamin/T2NER

T2NER: Transformers based Transfer Learning Framework for Named Entity...

29
Experimental
2438 PeterGriffinJin/Patton

Patton: Language Model Pretraining on Text-rich Networks (ACL 2023 main oral)

29
Experimental
2439 ai-art-dev99/llm-from-scratch

Build a Large Language Model From Scratch

29
Experimental
2440 zengqunzhao/Exp-CLIP

[WACV'25 Oral] Enhancing Zero-Shot Facial Expression Recognition by LLM...

29
Experimental
2441 Adriankhl/godot-llm-template

Godot LLM Template/Demo

29
Experimental
2442 CopperEagle/SmartFileLibrary

SmartFileLibrary is an AI-supported digital library, backed by a local...

29
Experimental
2443 varchasvee108/vision-transformer-maze-agent

Vision Transformer agent that learns to navigate mazes while visualizing...

29
Experimental
2444 JRC1995/BERT-Disaster-Classification-Capsule-Routing

Exploration of BERT-BiLSTM models with Layer Aggregation (attention-based...

29
Experimental
2445 CoffeeVampir3/ez-trainer

Train Llama Loras Easily

29
Experimental
2446 HasanBGit/KSAA2026-Fine-Tashkeel

Official code for "Fine-Tashkeel at KSAA-2026" — Systematic evaluation of 18...

29
Experimental
2447 liziniu/ReMax

Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement...

29
Experimental
2448 kazuki-irie/kv-memory-brain

Official Code Repository for the paper "Key-value memory in the brain"

29
Experimental
2449 crux82/BISS-2024

This repository hosts materials from the Bertinoro International Spring...

29
Experimental
2450 Baran-phys/Tropical-Attention

[NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic...

29
Experimental
2451 bvanaken/visbert

VisBERT: Demo web app for "How Does BERT Answer Questions?"

29
Experimental
2452 bminixhofer/zett

Code for Zero-Shot Tokenizer Transfer

29
Experimental
2453 clint-kristopher-morris/llm-guided-evolution

LLM Guided Evolution - The Automation of Models Advancing Models

29
Experimental
2454 aju22/LLaMA2

This repository contains an implementation of the LLaMA 2 (Large Language...

29
Experimental
2455 wjn1996/HugNLP

HugNLP is a unified and comprehensive NLP library based on HuggingFace...

28
Experimental
2456 ntt-dkiku/route-explainer

The official implementation of "RouteExplainer: An Explanation Framework for...

28
Experimental
2457 Mya-Mya/CBF-LLM

"CBF-LLM: Safe Control for LLM Alignment"

28
Experimental
2458 liziniu/GEM

Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large...

28
Experimental
2459 BenChaliah/NVFP4-on-4090-vLLM

AdaLLM is an NVFP4-first inference runtime for Ada Lovelace (RTX 4090) with...

28
Experimental
2460 JerryYLi/valhalla-nmt

Code repository for CVPR 2022 paper "VALHALLA: Visual Hallucination for...

28
Experimental
2461 TheBrainLab/SGLFormer

Spiking Global-Local Fusion Transformer

28
Experimental
2462 SJTU-DENG-Lab/LightningRL

LightningRL: Breaking the Accuracy–Parallelism Trade-off of Block-wise dLLMs...

28
Experimental
2463 czg1225/VeriThinker

[NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient

28
Experimental
2464 zjunlp/ModelKinship

Exploring Model Kinship for Merging Large Language Models

28
Experimental
2465 RenzeLou/Muffin

MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following

28
Experimental
2466 sichunluo/RecRanker

[TOIS'24] "RecRanker: Instruction Tuning Large Language Model as Ranker for...

28
Experimental
2467 SJTU-IPADS/Bamboo

Bamboo-7B Large Language Model

28
Experimental
2468 SCZwangxiao/RTQ-MM2023

ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding...

28
Experimental
2469 zjunlp/NLPCC2024_RegulatingLLM

[NLPCC 2024] Shared Task 10: Regulating Large Language Models

28
Experimental
2470 asigalov61/Orchestrator

Local windowed attention multi-instrumental music transformer tailored for...

28
Experimental
2471 UIC-InDeXLab/RSR

An Efficient Matrix Multiplication Algorithm for Accelerating Inference in...

28
Experimental
2472 alexliap/greek_gpt

MoE Decoder Transformer implementation with MLX

28
Experimental
2473 moharamfatema/graduation-project

Video vision transformers for hierarchical anomaly detection in video scenes.

28
Experimental
2474 Qwen-Applications/STAR

STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function...

28
Experimental
2475 OSU-NLP-Group/QA4RE

[ACL'23 Findings] "Aligning Instruction Tasks Unlocks Large Language Models...

28
Experimental
2476 declare-lab/Auto-Scaling

[Arxiv 2024] Official Implementation of the paper: "Towards Robust...

28
Experimental
2477 TamSiuhin/P2P

Source code for "Instant Personalized Large Language Model Adaptation via...

28
Experimental
2478 deepmancer/vlm-toolbox

Vision-Language Models Toolbox: Your all-in-one solution for multimodal...

28
Experimental
2479 hemangjoshi37a/hjAlgos

AI based algorithmic trading platform for zerodha users

28
Experimental
2480 PKU-Alignment/aligner

[NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct

28
Experimental
2481 jorgemunozl/Finetunning-Llama-Vision-11b

Inference and fine-tuning of a VLM (Llama Vision 11b) using the Unsloth,...

28
Experimental
2482 matt-k-wong/mlx-flash

Lightning-fast MLX utilities and optimizations for Apple Silicon

28
Experimental
2483 astrobleem/Simple-StableLM-Chat

This is a very simple python app that you can use to get up and chatting...

28
Experimental
2484 GiorgiaAuroraAdorni/gansformer-reproducibility-challenge

Replication of the novel Generative Adversarial Transformer.

28
Experimental
2485 fuglede/llama.ttf

A font for writing tiny stories

28
Experimental
2486 andrewliao11/LongPerceptualThoughts

[COLM'25] The official implementation of "LongPerceptualThoughts: Distilling...

28
Experimental
2487 Zishan-Shao/FlashSVD

Welcome to the FlashSVD, an activation aware inference system for SVD-based...

28
Experimental
2488 oshindutta/TVAprune

[ICML 2024 Es-FoMo] - Efficient LLM Pruning with Global Token-Dependency...

28
Experimental
2489 lfunderburk/automate-tech-post

LLM application: fine tuned model to generate social media posts from...

28
Experimental
2490 logic-OT/BobVLM

BobVLM – A 1.5B multimodal model built from scratch and pre-trained on a...

28
Experimental
2491 DavidValin/ai-mate

ai mate is a terminal-based audio conversation system between a user and AI models

28
Experimental
2492 achimoraites/machine-learning-playground

Having fun with ML

28
Experimental
2493 JIA-Lab-research/Step-DPO

Implementation for "Step-DPO: Step-wise Preference Optimization for...

28
Experimental
2494 dhruvdcoder/xlm-core

XLM is a modular, research-friendly framework for developing and comparing...

28
Experimental
2495 KarthikSriramGit/H.E.I.M.D.A.L.L

H.E.I.M.D.A.L.L looks at fleet telemetry and gives you natural-language...

28
Experimental
2496 winstxnhdw/llm-api

A fast CPU-based API for Qwen 2.5 using CTranslate2, hosted on Hugging Face Spaces.

28
Experimental
2497 jofaval/tfm-iabd

Master's Final Degree Project on Artificial Intelligence and Big Data

28
Experimental
2498 maxi-w/llama2-chat-interface

Gradio Chat Interface for Llama 2

28
Experimental
2499 andreiramani/jadi4llamacpp

Just another drop in for llama.cpp

28
Experimental
2500 alxfgh/Large-Language-Models-in-Chemistry

Working collection of papers, repos and models of transformer based language...

28
Experimental