Hassan-Sarwat/efficient-speculative-decoding
Improving both the reasoning speed of LLMs, through Chain of Draft fine-tuning, and their token output speed, through speculative decoding
Stars: —
Forks: —
Language: Jupyter Notebook
License: —
Category:
Last pushed: Feb 25, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Hassan-Sarwat/efficient-speculative-decoding"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
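The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the path layout (category, owner, repository) is inferred from the curl example above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example."""
    # Percent-encode each path segment so unusual names cannot break the path.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and parse the body.

    Assumption: the response body is JSON. Its fields are not documented
    on this page, so the parsed value is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(build_quality_url(
        "llm-tools", "Hassan-Sarwat", "efficient-speculative-decoding"
    ))
```

Without a key, requests count against the 100 requests/day limit, so a script that polls this endpoint should cache results rather than refetch on every run.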
Higher-rated alternatives
vitali87/speculant-graph
Graph drafts, LLM verifies: a novel speculative decoding framework
hsj576/GRIFFIN
Official Implementation of "GRIFFIN: Effective Token Alignment for Faster Speculative...
Hambaobao/HCP-Coder
Hierarchical Context Pruning (HCP): A strategy to optimize real-world code completion with...
levvius/adaptive-speculative-decoding
Adaptive speculative decoding for LLM inference latency optimization
hsj576/GTO
Official Implementation of "Bridging Draft Policy Misalignment: Group Tree Optimization for...