Hassan-Sarwat/efficient-speculative-decoding

Improving both the reasoning speed of LLMs, using Chain of Draft fine-tuning, and their token output speed, using speculative decoding

/ 100

Experimental

No License, No Package, No Dependents

Maintenance 10 / 25

Adoption 0 / 25

Maturity 1 / 25

Community 0 / 25

Stars: —
Forks: —
Language: Jupyter Notebook
License: —
Category:
Last pushed: Feb 25, 2026
Commits (30d):

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Hassan-Sarwat/efficient-speculative-decoding"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
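
The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes that the endpoint path follows the pattern category/owner/repository, which is inferred from the single URL shown above, and that the response body is JSON; neither is confirmed by this page. The names `quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment.

    Assumption: the path layout is <category>/<owner>/<repo>,
    inferred from the one example URL above.
    """
    segments = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the response is JSON. Its fields are not documented
    on this page, so the decoded object is returned unchanged.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("llm-tools", "Hassan-Sarwat", "efficient-speculative-decoding")` would request the same URL as the curl command. Without a key, the 100 requests/day limit stated above applies, so a script that polls many repositories may want to cache results.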

Higher-rated alternatives

vitali87/speculant-graph

Graph drafts, LLM verifies: a novel speculative decoding framework

hsj576/GRIFFIN

Official Implementation of "GRIFFIN: Effective Token Alignment for Faster Speculative...

Hambaobao/HCP-Coder

Hierarchical Context Pruning (HCP): A strategy to optimize real-world code completion with...

levvius/adaptive-speculative-decoding

Adaptive speculative decoding for LLM inference latency optimization

hsj576/GTO

Official Implementation of "Bridging Draft Policy Misalignment: Group Tree Optimization for...