All Transformer Models

6,429 models ranked by quality score · Page 8 of 65

Showing 701–800 of 6,429
# Model Score Tier
701 RUCAIBox/TextBox

TextBox 2.0 is a text generation library with pre-trained language models

45
Emerging
702 ymcui/Chinese-LLaMA-Alpaca-3

Chinese LLaMA-Alpaca project, phase 3 (Chinese Llama-3 LLMs) developed from Meta Llama 3

45
Emerging
703 CMKRG/QiZhenGPT

QiZhenGPT: An Open Source Chinese Medical Large Language Model

45
Emerging
704 mit-han-lab/hardware-aware-transformers

[ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

45
Emerging
705 eloialonso/iris

Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%.

45
Emerging
706 LinkSoul-AI/Chinese-Llama-2-7b

The open-source community's first downloadable, runnable Chinese LLaMA2 model!

45
Emerging
707 AGI-Edgerunners/LLM-Adapters

Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for...

45
Emerging
708 OpenMOSS/CoLLiE

Collaborative Training of Large Language Models in an Efficient Way

45
Emerging
709 romsto/Speculative-Decoding

Implementation of the paper Fast Inference from Transformers via Speculative...

45
Emerging
710 reasoning-survey/Awesome-Reasoning-Foundation-Models

✨✨Latest Papers and Benchmarks in Reasoning with Foundation Models

45
Emerging
711 allenai/RL4LMs

A modular RL library to fine-tune language models to human preferences

45
Emerging
712 local-ai-zone/local-ai-zone.github.io

Discover the Best AI Models for Your PC

45
Emerging
713 alan-turing-institute/robots-in-disguise

Information and materials for the Turing's "robots-in-disguise" reading...

45
Emerging
714 Instruction-Tuning-with-GPT-4/GPT-4-LLM

Instruction Tuning with GPT-4

45
Emerging
715 ALucek/NeedleInAVidStack

Extract, timestamp, and analyze specific content from video collections...

45
Emerging
716 Muennighoff/vilio

🥶Vilio: State-of-the-art VL models in PyTorch & PaddlePaddle

45
Emerging
717 neurocard/neurocard

State-of-the-art neural cardinality estimators for join queries

45
Emerging
718 kyegomez/RT-2

Democratization of RT-2 "RT-2: New model translates vision and language into action"

45
Emerging
719 Tencent-Hunyuan/GradLoc

Implementation of GradLoc from the Tencent Hunyuan blog "Stabilizing RLVR...

45
Emerging
720 tae898/erc

The official implementation of "EmoBERTa: Speaker-Aware Emotion Recognition...

45
Emerging
721 ukairia777/tensorflow-nlp-tutorial

Using TensorFlow: from text preprocessing to recent models such as Topic Models, BERT, GPT, and LLMs, and their downstream...

45
Emerging
722 ChangwenXu98/TransPolymer

Implementation of "TransPolymer: a Transformer-based language model for...

45
Emerging
723 baichuan-inc/Baichuan-7B

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

45
Emerging
724 kyegomez/Finetuning-Suite

Finetune any model on HF in less than 30 seconds

45
Emerging
725 arm-education/Advanced-AI-Hardware-Software-Co-Design

Hands-on course materials for ML engineers to master extreme model...

45
Emerging
726 cahya-wirawan/indonesian-language-models

Indonesian Language Models and its Usage

45
Emerging
727 AlignmentResearch/tuned-lens

Tools for understanding how transformer predictions are built layer-by-layer

45
Emerging
728 GURPREETKAURJETHRA/Generative-AI-LLM-Projects

Gen AI Large Language Model Projects

45
Emerging
729 huggingface/awesome-huggingface

🤗 A list of wonderful open-source projects & applications integrated with...

45
Emerging
730 KittenCN/predict_Lottery_ticket_pytorch

Lottery prediction with Transformer / LSTM models in PyTorch

45
Emerging
731 Lightning-Universe/lightning-transformers

Flexible components pairing 🤗 Transformers with ⚡ PyTorch Lightning

45
Emerging
732 hugofloresgarcia/vampnet

music generation with masked transformers!

45
Emerging
733 kyegomez/USM

Implementation of Google's USM speech model in Pytorch

45
Emerging
734 Tzohar/PassLLM

World's most accurate password guessing AI tool. A PyTorch implementation of...

45
Emerging
735 sberbank-ai-lab/LightAutoML

LAMA - automatic model creation framework

45
Emerging
736 Bavest/fin-llama

LLAMA specialized on finance

45
Emerging
737 spcl/x1

Official Implementation of "Reasoning Language Models: A Blueprint"

44
Emerging
738 amanvirparhar/weebo

A real-time speech-to-speech chatbot powered by Whisper Small, Llama 3.2,...

44
Emerging
739 flipkart-incubator/spark-transformers

Spark-Transformers: Library for exporting Apache Spark MLLIB models to use...

44
Emerging
740 YeonwooSung/ai_book

AI book for everyone

44
Emerging
741 jianzhnie/LLamaTuner

Easy and Efficient Finetuning LLMs. (Supported LLama, LLama2, LLama3, Qwen,...

44
Emerging
742 ariya/ask-llm

Interact with any LLM service

44
Emerging
743 xinzhanguo/hellollm

Pre-train a new LLM

44
Emerging
744 andrewkchan/deepseek.cpp

CPU inference for the DeepSeek family of large language models in C++

44
Emerging
745 mbzuai-oryx/MobiLlama

[ICLR-2025-SLLM Spotlight 🔥]MobiLlama : Small Language Model tailored for...

44
Emerging
746 OPTML-Group/Unlearn-Simple

[NeurIPS25] Official repo for "Simplicity Prevails: Rethinking Negative...

44
Emerging
747 ckiplab/ckip-transformers

CKIP Transformers

44
Emerging
748 pytorch/torchchat

Run PyTorch LLMs locally on servers, desktop and mobile

44
Emerging
749 google-research/long-range-arena

Long Range Arena for Benchmarking Efficient Transformers

44
Emerging
750 amazon-science/tanl

Structured Prediction as Translation between Augmented Natural Languages

44
Emerging
751 refuel-ai/autolabel

Label, clean and enrich text datasets with LLMs.

44
Emerging
752 mounalab/Multivariate-time-series-forecasting-keras

This project provides implementations with Keras/Tensorflow of some deep...

44
Emerging
753 thuml/AutoTimes

Official implementation for "AutoTimes: Autoregressive Time Series...

44
Emerging
754 yesbhautik/Talk-with-PDF

An interactive AI chatbot for querying and discussing the contents of PDF...

44
Emerging
755 jerry1993-tech/Cornucopia-LLaMA-Fin-Chinese

Cornucopia (聚宝盆):...

44
Emerging
756 yoniLc/ECCT

Error Correction Code Transformer

44
Emerging
757 zejia-lin/BulletServe

Boosting GPU utilization for LLM serving via dynamic spatial-temporal...

44
Emerging
758 brontoguana/krasis

Krasis is a Hybrid LLM runtime which focuses on efficient running of larger...

44
Emerging
759 CASIA-LMC-Lab/FLAP

[AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models

44
Emerging
760 interestingLSY/swiftLLM

A tiny yet powerful LLM inference system tailored for researching purpose....

44
Emerging
761 chengzeyi/ParaAttention

https://wavespeed.ai/ Context parallel attention that accelerates DiT model...

44
Emerging
762 invictus717/MetaTransformer

Meta-Transformer for Unified Multimodal Learning

44
Emerging
763 pairlab/SlotFormer

Code release for ICLR 2023 paper: SlotFormer on object-centric dynamics models

44
Emerging
764 Kaushalya/medclip

A multi-modal CLIP model trained on the medical dataset ROCO

44
Emerging
765 CouncilDataProject/speakerbox

Speakerbox: Fine-tune Audio Transformers for speaker identification.

44
Emerging
766 ictnlp/Stream-Omni

Stream-Omni is a GPT-4o-like language-vision-speech chatbot that...

44
Emerging
767 ThinamXx/Transformers_NLP

The repository will contain a list of projects which we will work on while...

44
Emerging
768 Mann1988/awesome-claude-skills

📊 Explore high-quality Claude skills focused on business analysis and...

44
Emerging
769 ParthaPRay/LLM-Learning-Sources

This repo contains a list of channels and sources from where LLMs should be learned

44
Emerging
770 Trustworthy-ML-Lab/CB-LLMs

[ICLR 25] A novel framework for building intrinsically interpretable LLMs...

44
Emerging
771 QData/LaMP

ECML 2019: Graph Neural Networks for Multi-Label Classification

44
Emerging
772 4AI/LS-LLaMA

A Simple but Powerful SOTA NER Model | Official Code For Label Supervised...

44
Emerging
773 buaacyw/MeshAnything

[ICLR 2025] From anything to mesh like human artists. Official impl. of...

44
Emerging
774 buaacyw/MeshAnythingV2

[ICCV 2025] From anything to mesh like human artists. Official impl. of...

44
Emerging
775 open-compass/MixtralKit

A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI

44
Emerging
776 linjieli222/HERO

Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for...

44
Emerging
777 absadiki/pyllamacpp

Python bindings for llama.cpp

44
Emerging
778 MoonshotAI/MoBA

MoBA: Mixture of Block Attention for Long-Context LLMs

44
Emerging
779 iflytek/cino

CINO: Pre-trained Language Models for Chinese Minority Languages

44
Emerging
780 bhavsarpratik/serverless-transformers-on-aws-lambda

Deploy transformers serverless on AWS Lambda

44
Emerging
781 MIC-DKFZ/MedNeXt

[MICCAI 2023] MedNeXt is a fully ConvNeXt architecture for 3D medical image...

44
Emerging
782 HiThink-Research/MME-Finance

[MM 2025] A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning

44
Emerging
783 snap-research/EfficientFormer

EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]

44
Emerging
784 HUST-NingKang-Lab/MGM

MGM (Microbial General Model) as a large-scaled pretrained language model...

44
Emerging
785 kamalkraj/e5-mistral-7b-instruct

Finetune mistral-7b-instruct for sentence embeddings

44
Emerging
786 alohays/awesome-visual-representation-learning-with-transformers

Awesome Transformers (self-attention) in Computer Vision

44
Emerging
787 kyegomez/GPT4o

Community Open Source Implementation of GPT4o in PyTorch

44
Emerging
788 ZinYY/TreeLoRA

A pytorch implementation of the paper "TreeLoRA: Efficient Continual...

44
Emerging
789 bytedance/effective_transformer

Running BERT without Padding

44
Emerging
790 Aratako/T5Gemma-TTS

Multilingual TTS model with voice cloning and duration control, based on...

44
Emerging
791 IntelLabs/causality-lab

Causal discovery algorithms and tools for implementing new ones

44
Emerging
792 xrsrke/pipegoose

Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of...

44
Emerging
793 dccuchile/beto

BETO - Spanish version of the BERT model

44
Emerging
794 HKUDS/LightReasoner

"LightReasoner: Can Small Language Models Teach Large Language Models Reasoning?"

44
Emerging
795 sgrvinod/chess-transformers

Teaching transformers to play chess

44
Emerging
796 JAMESYJL/ShapeLLM-Omni

[NeurIPS 2025 Spotlight] A Native Multimodal LLM for 3D Generation and Understanding

44
Emerging
797 Victorwz/LongMem

Official implementation of our NeurIPS 2023 paper "Augmenting Language...

44
Emerging
798 HugAILab/HugNLP

CIKM2023 Best Demo Paper Award. HugNLP is a unified and comprehensive NLP...

44
Emerging
799 ymcui/MacBERT

Revisiting Pre-trained Models for Chinese Natural Language Processing (MacBERT)

44
Emerging
800 smalltong02/keras-llm-robot

A web UI Project In order to learn the large language model. This project...

44
Emerging