Mixture-of-Experts LLMs: Transformer Models

There are 19 mixture-of-experts LLM projects tracked. The highest-rated is EfficientMoE/MoE-Infinity, scoring 43/100 with 288 stars.

Get all 19 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=mixture-of-experts-llms&limit=20"

Open to everyone: 100 requests/day with no API key needed. Get a free key for 1,000 requests/day.
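
If you prefer to pull the data programmatically, here is a minimal Python sketch using the `requests` library against the endpoint above. The exact response schema is not documented on this page, so the field names used below (`projects`, `name`, `score`, `tier`) are assumptions for illustration only.

```python
# Minimal sketch: fetch the mixture-of-experts dataset and print each entry.
# Assumes a JSON response; the field names below ("projects", "name",
# "score", "tier") are guesses about the schema, not documented by the API.
import requests

URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"
params = {
    "domain": "transformers",
    "subcategory": "mixture-of-experts-llms",
    "limit": 20,
}

resp = requests.get(URL, params=params, timeout=30)
resp.raise_for_status()
data = resp.json()

# Assumed shape: {"projects": [{"name": ..., "score": ..., "tier": ...}, ...]}
for project in data.get("projects", []):
    print(f'{project.get("score", "?"):>3}  {project.get("tier", "?"):<12}  {project.get("name", "?")}')
```
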

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | EfficientMoE/MoE-Infinity | PyTorch library for cost-effective, fast and easy serving of MoE models. | 43 | Emerging |
| 2 | jaisidhsingh/pytorch-mixtures | One-stop solutions for Mixture of Expert modules in PyTorch. | 42 | Emerging |
| 3 | raymin0223/mixture_of_recursions | Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive... | 41 | Emerging |
| 4 | thu-nics/MoA | [CoLM'25] The official implementation of the paper | 39 | Emerging |
| 5 | AviSoori1x/makeMoE | From scratch implementation of a sparse mixture of experts language model... | 39 | Emerging |
| 6 | CASE-Lab-UMD/Unified-MoE-Compression | The official implementation of the paper "Towards Efficient Mixture of... | 37 | Emerging |
| 7 | MoonshotAI/MoBA | MoBA: Mixture of Block Attention for Long-Context LLMs | 37 | Emerging |
| 8 | ByteDance-Seed/FlexPrefill | Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse... | 35 | Emerging |
| 9 | efeslab/fiddler | [ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration | 35 | Emerging |
| 10 | FareedKhan-dev/qwen3-MoE-from-scratch | A Step-by-Step Implementation of Qwen 3 MoE Architecture from Scratch | 33 | Emerging |
| 11 | lliai/D2MoE | D^2-MoE: Delta Decompression for MoE-based LLMs Compression | 30 | Emerging |
| 12 | SkyworkAI/MoE-plus-plus | [ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with... | 29 | Experimental |
| 13 | dmis-lab/Monet | [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers | 28 | Experimental |
| 14 | CASE-Lab-UMD/Router-Tuning-Mixture-of-Depths | The open-source Mixture of Depths code and the official implementation of... | 27 | Experimental |
| 15 | cmu-flame/FLAME-MoE | Official repository for FLAME-MoE: A Transparent End-to-End Research... | 26 | Experimental |
| 16 | UNITES-Lab/HEXA-MoE | Official code for the paper "HEXA-MoE: Efficient and Heterogeneous-Aware MoE... | 17 | Experimental |
| 17 | Spico197/MoE-SFT | 🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction... | 16 | Experimental |
| 18 | RoyZry98/T-REX-Pytorch | [Arxiv 2025] Official code for T-REX: Mixture-of-Rank-One-Experts with... | 14 | Experimental |
| 19 | zhongshsh/MoExtend | ACL 2024 (SRW), Official Codebase of our Paper: "MoExtend: Tuning New... | 14 | Experimental |