Mixture-of-Experts LLMs: Transformer Models
There are 19 mixture-of-experts (MoE) LLM projects tracked. The highest-rated is EfficientMoE/MoE-Infinity at 43/100 with 288 stars.
Get all 19 projects as JSON:

```bash
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=mixture-of-experts-llms&limit=20"
```

The endpoint is open to everyone at 100 requests/day with no key needed; a free key raises the limit to 1,000 requests/day.
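For working with the response programmatically, here is a minimal Python sketch using the requests library. The response schema (a JSON array of project records) is an assumption, not documented here; inspect the actual payload and adjust the field handling accordingly.

```python
# Minimal sketch: fetch the MoE LLM project list as JSON.
# Assumes the endpoint returns a JSON array of project records;
# the real schema may differ, so inspect the payload first.
import requests

resp = requests.get(
    "https://pt-edge.onrender.com/api/v1/datasets/quality",
    params={
        "domain": "transformers",
        "subcategory": "mixture-of-experts-llms",
        "limit": 20,
    },
    timeout=30,
)
resp.raise_for_status()

for project in resp.json():
    print(project)
```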
| # | Model | Description | Score (/100) | Tier |
|---|---|---|---|---|
| 1 | EfficientMoE/MoE-Infinity | PyTorch library for cost-effective, fast and easy serving of MoE models. | 43 | Emerging |
| 2 | jaisidhsingh/pytorch-mixtures | One-stop solutions for Mixture of Expert modules in PyTorch. | | Emerging |
| 3 | raymin0223/mixture_of_recursions | Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive... | | Emerging |
| 4 | thu-nics/MoA | [CoLM'25] The official implementation of the paper | | Emerging |
| 5 | AviSoori1x/makeMoE | From scratch implementation of a sparse mixture of experts language model... | | Emerging |
| 6 | CASE-Lab-UMD/Unified-MoE-Compression | The official implementation of the paper "Towards Efficient Mixture of... | | Emerging |
| 7 | MoonshotAI/MoBA | MoBA: Mixture of Block Attention for Long-Context LLMs | | Emerging |
| 8 | ByteDance-Seed/FlexPrefill | Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse... | | Emerging |
| 9 | efeslab/fiddler | [ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration | | Emerging |
| 10 | FareedKhan-dev/qwen3-MoE-from-scratch | A Step-by-Step Implementation of Qwen 3 MoE Architecture from Scratch | | Emerging |
| 11 | lliai/D2MoE | D^2-MoE: Delta Decompression for MoE-based LLMs Compression | | Emerging |
| 12 | SkyworkAI/MoE-plus-plus | [ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with... | | Experimental |
| 13 | dmis-lab/Monet | [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers | | Experimental |
| 14 | CASE-Lab-UMD/Router-Tuning-Mixture-of-Depths | The open-source Mixture of Depths code and the official implementation of... | | Experimental |
| 15 | cmu-flame/FLAME-MoE | Official repository for FLAME-MoE: A Transparent End-to-End Research... | | Experimental |
| 16 | UNITES-Lab/HEXA-MoE | Official code for the paper "HEXA-MoE: Efficient and Heterogeneous-Aware MoE... | | Experimental |
| 17 | Spico197/MoE-SFT | 🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction... | | Experimental |
| 18 | RoyZry98/T-REX-Pytorch | [Arxiv 2025] Official code for T-REX: Mixture-of-Rank-One-Experts with... | | Experimental |
| 19 | zhongshsh/MoExtend | ACL 2024 (SRW), Official Codebase of our Paper: "MoExtend: Tuning New... | | Experimental |
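Several of the projects above (makeMoE, pytorch-mixtures, qwen3-MoE-from-scratch) implement variants of sparse top-k expert routing, the core mechanism behind MoE LLMs. The sketch below illustrates that pattern in plain PyTorch; it is a generic illustration, not code from any listed repository, and the class name, expert width, and default hyperparameters are arbitrary choices.

```python
# Illustrative sparse top-k MoE layer: a router scores experts per token,
# only the top-k experts run, and their outputs are combined with the
# renormalized router weights. Generic sketch, not from any listed repo.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, dim)
        logits = self.router(x)                          # (num_tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)   # keep only top-k experts
        weights = F.softmax(weights, dim=-1)             # renormalize over top-k
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                    # tokens whose k-th pick is e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

# Example: route 16 tokens of width 64 through the layer.
moe = SparseMoE(dim=64)
tokens = torch.randn(16, 64)
print(moe(tokens).shape)  # torch.Size([16, 64])
```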