SafeAILab/EAGLE

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

/ 100

Established

Accelerates LLM inference with speculative decoding: a lightweight draft model extrapolates the base LLM's internal features to propose tokens, which the base LLM then verifies, giving a reported 5.6x speedup on 13B models while preserving the base model's output distribution. The method has evolved across three versions: EAGLE-1 extrapolates second-to-top-layer features, EAGLE-2 adds confidence-based dynamic adjustment of the draft tree, and EAGLE-3 removes the feature-prediction constraint through training-time simulation and draws on multi-level features (low, mid, and high layers) of the base LLM. Supported in the vLLM, TensorRT-LLM, and SGLang serving frameworks and on AMD ROCm, with draft-model checkpoints available for Llama, Vicuna, Qwen, and DeepSeek models.
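The claim that the output distribution is unchanged rests on the accept/reject rule of speculative sampling. The sketch below illustrates that generic rule for a single drafted token; it is not taken from the EAGLE codebase, and the function names (`speculative_accept`, `empirical_distribution`) and the toy probability vectors are assumptions made for illustration only.

```python
import random


def speculative_accept(p_target, q_draft, drafted_token, rng):
    """Generic speculative-sampling step (illustrative, not EAGLE's code).

    p_target / q_draft are next-token probability lists from the base
    and draft models; drafted_token was sampled from q_draft.
    Returns (token, accepted).
    """
    p = p_target[drafted_token]
    q = q_draft[drafted_token]
    # Accept the drafted token with probability min(1, p / q).
    if rng.random() < min(1.0, p / q):
        return drafted_token, True
    # On rejection, resample from the residual max(0, p - q);
    # rng.choices normalizes the weights internally.
    residual = [max(0.0, pt - qd) for pt, qd in zip(p_target, q_draft)]
    token = rng.choices(range(len(residual)), weights=residual, k=1)[0]
    return token, False


def empirical_distribution(p_target, q_draft, n, seed=0):
    """Draft from q_draft, verify against p_target, return token frequencies."""
    rng = random.Random(seed)
    counts = [0] * len(p_target)
    for _ in range(n):
        drafted = rng.choices(range(len(q_draft)), weights=q_draft, k=1)[0]
        token, _ = speculative_accept(p_target, q_draft, drafted, rng)
        counts[token] += 1
    return [c / n for c in counts]


if __name__ == "__main__":
    p = [0.6, 0.3, 0.1]  # toy base-model distribution (assumed)
    q = [0.2, 0.5, 0.3]  # toy draft-model distribution (assumed)
    print(empirical_distribution(p, q, n=50_000))
```

With enough samples the printed frequencies approach `p` rather than `q`, even though every candidate token came from the draft distribution. A draft model that matches the base model more closely gets more tokens accepted, which is where the speedup comes from.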

2,213 stars.

No package. No dependents.

Maintenance 10 / 25

Adoption 10 / 25

Maturity 9 / 25

Community 21 / 25

Forks

260

Language

Python

License

—

Related models

sgl-project/SpecForge

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

structuredllm/syncode

Efficient and general syntactical decoding for Large Language Models

romsto/Speculative-Decoding

Implementation of the paper Fast Inference from Transformers via Speculative Decoding, Leviathan...

torchspec-project/TorchSpec

A PyTorch native library for training speculative decoding models

hao-ai-lab/JacobiForcing

Jacobi Forcing: Fast and Accurate Diffusion-style Decoding