All Transformer Models

6,429 models ranked by quality score · Page 21 of 65

Showing 2001–2100 of 6,429
# Model Score Tier
2001 myscience/x-lstm

PyTorch implementation of the xLSTM model by Beck et al. (2024)

32
Emerging
2002 vivy-yi/awesome-llm-training-inference

Curated list of LLM training and inference frameworks, tools, and resources....

32
Emerging
2003 nareshis21/Truelarge-RT

Android inference engine running 20B+ parameter LLMs on 4GB-8GB RAM devices....

32
Emerging
2004 microsoft/encoder-decoder-slm

Efficient encoder-decoder architecture for small language models (≤1B...

32
Emerging
2005 mdegans/drama_llama

Yet another `llama.cpp` Rust wrapper

32
Emerging
2006 yaph/charla

A terminal-based chat application that works with AI language models.

32
Emerging
2007 QwenLM/ParScale

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

32
Emerging
2008 Simplifine-gamedev/Simplifine

🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud...

32
Emerging
2009 urban-mobility-generation/Language-Modeling-for-Urban-Mobility

Language Modeling for Urban Mobility: A Data-Centric Review and Guidelines

32
Emerging
2010 CogitatorTech/zigformer

An educational transformer-based LLM in pure Zig

32
Emerging
2011 rhnfzl/SqueakyCleanText

Text preprocessing and PII anonymisation for NLP/ML. ONNX NER ensemble,...

32
Emerging
2012 luffycodes/Tutorbot-Spock

An Education Tutoring Chatbot based on Learning Science Principles powered...

32
Emerging
2013 yaodongC/awesome-instruction-dataset

A collection of open-source datasets to train instruction-following LLMs...

32
Emerging
2014 LorenzoAgnolucci/BERT_for_ABSA

In this work, the (Targeted) Aspect-Based Sentiment Analysis task is converted to...

32
Emerging
2015 datawhalechina/unlock-hf

Unlocking the many uses of the HuggingFace ecosystem

32
Emerging
2016 HeegyuKim/language-model

A project for training Korean language models (Flax, PyTorch with Huggingface Accelerate)

32
Emerging
2017 Mmorgan-ML/Neuromodulatory-Control-Networks

Neuromodulatory Control Networks (NCNs), a novel LLM architectural...

32
Emerging
2018 Koratahiu/Advanced_Optimizers

A family of highly efficient, lightweight yet powerful optimizers.

32
Emerging
2019 zabir-nabil/awesome-multilingual-large-language-models

A comprehensive collection of multilingual datasets and large language...

32
Emerging
2020 takara-ai/SwarmFormer

A PyTorch implementation of SwarmFormer for text classification.

32
Emerging
2021 armbues/SiLLM-examples

Examples for using the SiLLM framework for training and running Large...

32
Emerging
2022 sayakpaul/deploy-hf-tf-vision-models

This repository shows various ways of deploying a vision model (TensorFlow)...

32
Emerging
2023 seongminp/transformers-into-vaes

Code for "Finetuning Pretrained Transformers into Variational Autoencoders"

32
Emerging
2024 mshenoda/roberta-spam

RoBERTa based Spam Message Detection

32
Emerging
2025 rezazad68/TMUnet

Contextual Attention Network: Transformer Meets U-Net

32
Emerging
2026 liuyang-ict/SAP-DETR

[CVPR 2023] Official implementation of "SAP-DETR: Bridging the Gap between...

32
Emerging
2027 XunshanMan/MVGFormer

This is the official implementation of the work presented at CVPR 2024,...

32
Emerging
2028 clip-italian/clip-italian

CLIP (Contrastive Language–Image Pre-training) for Italian

32
Emerging
2029 AILab-CVC/M2PT

[CVPR 2024] Multimodal Pathway: Improve Transformers with Irrelevant Data...

32
Emerging
2030 oxidized-transformers/oxidized-transformers

Modular Rust transformer/LLM library using Candle

32
Emerging
2031 NSLab-CUK/Unified-Graph-Transformer

Unified Graph Transformer (UGT) is a novel Graph Transformer model...

32
Emerging
2032 tsinghua-fib-lab/UniST

Official implementation for "UniST: A Prompt-Empowered Universal Model for...

32
Emerging
2033 sanjaradylov/smiles-gpt

Generative Pre-Training from Molecules

32
Emerging
2034 remixer-dec/botality-ii

Telegram bot for self-hosted local inference of Stable Diffusion,...

32
Emerging
2035 SapienzaNLP/ita-bench

A collection of Italian benchmarks for LLM evaluation

32
Emerging
2036 BFCmath/FinetuneAI_Learning

How to effectively finetune CV/LLM models (without local gpu)

32
Emerging
2037 CJReinforce/PURE

Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is...

32
Emerging
2038 soumyadip1995/BabyGPT

Something in the middle of Karpathy's mingpt model and video lectures, ...

32
Emerging
2039 yinzhangyue/SelfAware

Do Large Language Models Know What They Don’t Know?

32
Emerging
2040 Nota-NetsPresso/shortened-llm

Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop]

32
Emerging
2041 GunjanDhanuka/stocks-trading-bot

A multi-purpose repository with Sentiment Analysis of Stocks news, and...

32
Emerging
2042 Buyun-Liang/SECA

[NeurIPS 2025] SECA: Semantically Equivalent and Coherent Attacks for...

32
Emerging
2043 rti/gptvis

Understanding Transformers Using A Minimal Example

32
Emerging
2044 fvliang/DART

Official Implementation of DART (DART: Diffusion-Inspired Speculative...

32
Emerging
2045 ShiZhengyan/InstructionModelling

[NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With...

32
Emerging
2046 albrateanu/ModalFormer

[2025] ModalFormer: Multimodal Transformer for Low-Light Image Enhancement

32
Emerging
2047 katha-ai/EmoTx-CVPR2023

[CVPR 2023] Official code repository for "How you feelin'? Learning Emotions...

32
Emerging
2048 leondz/lm_risk_cards

Risks and targets for assessing LLMs & LLM vulnerabilities

32
Emerging
2049 BillChan226/HALC

[ICML 2024] Official implementation for "HALC: Object Hallucination...

32
Emerging
2050 RishabSA/Sketch2Graphviz

Sketch2Graphviz allows you to convert sketches or images of graphs and...

32
Emerging
2051 robert-mcdermott/ollama-batch-cluster

Large Scale Batch Processing with Ollama

32
Emerging
2052 praj2408/Text-Summarizer-Project

The text summarizer project is an innovative tool designed to condense...

32
Emerging
2053 FutureComputing4AI/HGConv

HGConv: Holographic Global Convolutional Networks

32
Emerging
2054 amazon-science/text_generation_diffusion_llm_topic

Topic Embedding, Text Generation and Modeling using diffusion

32
Emerging
2055 IDSIA/automated-cl

Official repository for the paper "Automating Continual Learning"

32
Emerging
2056 IDSIA/lmtool-fwp

PyTorch Language Modeling Toolkit for Fast Weight Programmers

32
Emerging
2057 waltonfuture/InstructionGPT-4

InstructionGPT-4

32
Emerging
2058 laelhalawani/gguf_llama

Wrapper for simplified use of Llama2 GGUF quantized models.

32
Emerging
2059 umitkacar/pytorch-interactive-learning

Professional PyTorch CLI learning tool with 24 comprehensive lessons - From...

32
Emerging
2060 yangjianxin1/Firefly-LLaMA2-Chinese

Firefly Chinese LLaMA-2 large model, supporting incremental pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, Intern...

32
Emerging
2061 wangxiao5791509/MultiModal_BigModels_Survey

[MIR-2023-Survey] A continuously updated paper list for multi-modal...

32
Emerging
2062 Nithin-Holla/meme_challenge

Repository containing code from team Kingsterdam for the Hateful Memes Challenge

32
Emerging
2063 StyrbjornKall/TRIDENT

A collection of transformer-based models and developmental scripts presented...

32
Emerging
2064 The-Swarm-Corporation/Hyena-Y

A PyTorch implementation of the Hyena-Y model, a convolution-based...

32
Emerging
2065 templetwo/PhaseGPT

Kuramoto Phase-Coupled Oscillator Attention in Transformers

32
Emerging
2066 naokishibuya/simple_transformer

A Transformer Implementation that is easy to understand and customizable.

32
Emerging
2067 nihalsangeeth/behaviour-seq-transformer

Pytorch implementation of "Behaviour Sequence Transformer for E-commerce...

32
Emerging
2068 cyk1337/Transformer-in-PyTorch

Transformer/Transformer-XL/R-Transformer examples and explanations

32
Emerging
2069 yueliu1999/FlipAttack

[ICML 2025] An official source code for paper "FlipAttack: Jailbreak LLMs...

32
Emerging
2070 yao8839836/kg-llm

Exploring large language models for knowledge graph completion. ICASSP 2025

32
Emerging
2071 anyantudre/Florence-2-Vision-Language-Model

Florence-2 is a novel vision foundation model with a unified, prompt-based...

32
Emerging
2072 fran-martinez/bio_ner_bert

BERT finetuned on NER downstream tasks

32
Emerging
2073 rasbt/faster-pytorch-blog

Outlining techniques for improving the training performance of your PyTorch...

32
Emerging
2074 mlverse/mall

Run multiple LLM predictions against a data frame with R and Python

32
Emerging
2075 mytechnotalent/falcongpt

Simple GPT app that uses the falcon-7b-instruct model with a Flask front-end.

32
Emerging
2076 aquadzn/deploy-transformers

Easily deploy a state-of-the-art language model from HuggingFace's Transformers

32
Emerging
2077 BorealisAI/flora-opt

This is the official repository for the paper "Flora: Low-Rank Adapters Are...

32
Emerging
2078 alvion427/PerroPastor

Run Llama based LLMs in Unity entirely in compute shaders with no dependencies

32
Emerging
2079 midway2333/Tower2

Multimodal language model architecture

32
Emerging
2080 zubair-irshad/NeRF-MAE

[ECCV 2024] Pytorch code for our ECCV'24 paper NeRF-MAE: Masked AutoEncoders...

32
Emerging
2081 dsdanielpark/open-llm-datasets

Repository for organizing datasets and papers used in Open LLM.

32
Emerging
2082 SafeAILab/RAIN

[ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning

32
Emerging
2083 deep-symbolic-mathematics/llm-srbench

[ICML2025 Oral] LLM-SRBench: A New Benchmark for Scientific Equation...

32
Emerging
2084 HKUNLP/efficient-attention

[EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control...

32
Emerging
2085 synlp/R2-LLM

The official GitHub repository of the AAAI-2024 paper "Bootstrapping Large...

32
Emerging
2086 azminewasi/Awesome-LLMs-ICLR-24

It is a comprehensive resource hub compiling all LLM papers accepted at the...

32
Emerging
2087 open-compass/ANAH

[ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO

32
Emerging
2088 mohd-faizy/06P_Sentiment-Analysis-With-Deep-Learning-Using-BERT

Finetuning BERT in PyTorch for sentiment analysis.

32
Emerging
2089 The-Martyr/CausalMM

[ICLR 2025] Mitigating Modality Prior-Induced Hallucinations in Multimodal...

32
Emerging
2090 smvorwerk/xlstm-cuda

CUDA implementation of Extended Long Short Term Memory (xLSTM) with C++ and...

32
Emerging
2091 Qwen-Applications/CLIPO

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

32
Emerging
2092 SalesforceAIResearch/Elastic-Reasoning

Make reasoning models scalable

32
Emerging
2093 arshadshk/Last_Query_Transformer_RNN-PyTorch

Implementation of the paper "Last Query Transformer RNN for knowledge...

32
Emerging
2094 anas-zafar/LLM-Survey

The official GitHub page for the survey paper "A Survey on Large Language...

32
Emerging
2095 fajri91/sum_liputan6

The first large-scale summarization corpus for the Indonesian language. AACL 2020.

32
Emerging
2096 cmu-flame/FLAME-MoE

Official repository for FLAME-MoE: A Transparent End-to-End Research...

32
Emerging
2097 ymoslem/Adaptive-MT-LLM

Adaptive Machine Translation with Large Language Models

32
Emerging
2098 HxCodeWarrior/StellarByte

Implement a basic Transformer decoder-only model from scratch, then upgrade it to build your own LLM

32
Emerging
2099 TIGER-AI-Lab/TIGERScore

"TIGERScore: Towards Building Explainable Metric for All Text Generation...

32
Emerging
2100 xmartlabs/spoter-embeddings

Create embeddings from sign pose videos using Transformers

32
Emerging