SafeAILab/EAGLE
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
Implements speculative decoding with a lightweight draft model that extrapolates the base LLM's internal features instead of predicting tokens directly, with a reported speedup of up to 5.6x over vanilla decoding on 13B models while preserving the output distribution. The method evolved across three versions: EAGLE-1 extrapolates second-top-layer features, EAGLE-2 uses the draft model's confidence scores to adjust the draft tree dynamically, and EAGLE-3 removes the feature prediction constraint by simulating inference during training and fuses low-, mid-, and high-layer features in place of top-layer features alone. EAGLE is integrated into major LLM serving frameworks, including vLLM, TensorRT-LLM, and SGLang, and is supported on AMD ROCm. Pretrained draft-model checkpoints are available for Llama, Vicuna, Qwen, and DeepSeek models.
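To make the "output distribution equivalence" claim concrete, here is a minimal sketch of the standard speculative sampling acceptance rule (from Leviathan et al., the paper implemented by romsto/Speculative-Decoding listed under related projects). It is not EAGLE's code: it assumes a single chain of draft tokens rather than EAGLE's draft tree, and the names `verify_draft` and `sample` and the dictionary-based distributions are hypothetical, chosen only for illustration.

```python
import random
from typing import Dict, List

Dist = Dict[str, float]  # token -> probability (illustrative representation)


def sample(weights: Dist, rng: random.Random) -> str:
    """Draw one token in proportion to its (possibly unnormalized) weight."""
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


def verify_draft(
    draft_tokens: List[str],
    draft_dists: List[Dist],   # q_i: the distribution draft_tokens[i] was sampled from
    target_dists: List[Dist],  # p_i: target distribution per position, plus one extra entry
    rng: random.Random,
) -> List[str]:
    """Accept a prefix of the draft, then emit one token chosen by the target.

    Each draft token x is accepted with probability min(1, p(x) / q(x)).
    On the first rejection, a replacement is sampled from the residual
    max(0, p - q) and verification stops. If every draft token is accepted,
    a bonus token is sampled from the final target distribution.
    """
    out: List[str] = []
    for i, tok in enumerate(draft_tokens):
        p = target_dists[i].get(tok, 0.0)
        q = draft_dists[i][tok]  # > 0 because tok was sampled from q
        if rng.random() < min(1.0, p / q):
            out.append(tok)
            continue
        # Rejection implies p(tok) < q(tok), so the residual has positive mass.
        residual = {
            t: max(0.0, pv - draft_dists[i].get(t, 0.0))
            for t, pv in target_dists[i].items()
        }
        out.append(sample(residual, rng))
        return out
    out.append(sample(target_dists[len(draft_tokens)], rng))
    return out
```

Under this rule the emitted tokens follow the target model's distribution no matter how good the draft is. Draft quality only affects how many tokens are accepted per verification step, which is where the speedup comes from.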
Stars: 2,213
Forks: 260
Language: Python
License: not listed
Category: not listed
Last pushed: Feb 20, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/SafeAILab/EAGLE"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Related models
sgl-project/SpecForge
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
structuredllm/syncode
Efficient and general syntactical decoding for Large Language Models
romsto/Speculative-Decoding
Implementation of the paper Fast Inference from Transformers via Speculative Decoding, Leviathan...
torchspec-project/TorchSpec
A PyTorch native library for training speculative decoding models
hao-ai-lab/JacobiForcing
Jacobi Forcing: Fast and Accurate Diffusion-style Decoding