torchspec-project/TorchSpec

A PyTorch native library for training speculative decoding models

Score: 40 / 100 (Emerging)

Decouples inference and training via a disaggregated pipeline that streams hidden states from vLLM or SGLang inference engines to distributed training workers through Mooncake's in-memory store, enabling independent scaling of each component. Integrates directly with PyTorch FSDP for distributed training, uses vLLM's Worker Extension API to avoid RPC serialization overhead, and supports vocabulary pruning with HuggingFace checkpoint conversion. Includes production examples for Qwen3, Kimi-K2.5, and MiniMax-M2.5 models with configurable training modes for resuming interrupted runs or continual training from existing weights.
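The disaggregated pipeline described above can be sketched with a plain in-process queue standing in for Mooncake's in-memory store. This is a minimal illustration of the producer/consumer decoupling, not TorchSpec's actual API: the class and function names here are invented for the sketch, and real batches would be tensors streamed from a vLLM or SGLang worker.

```python
import queue
import threading

# Stand-in for Mooncake's in-memory store: a bounded queue that the
# inference side fills with hidden states and the training side drains.
# All names here are illustrative, not TorchSpec's real interface.
class HiddenStateStore:
    def __init__(self, capacity=8):
        self._q = queue.Queue(maxsize=capacity)

    def put(self, batch):
        self._q.put(batch)          # blocks if training falls behind

    def get(self):
        return self._q.get()

def inference_worker(store, num_batches):
    """Simulates an engine (e.g. vLLM) emitting per-batch hidden states."""
    for step in range(num_batches):
        hidden = [0.1 * step] * 4   # placeholder for real hidden-state tensors
        store.put({"step": step, "hidden": hidden})
    store.put(None)                 # sentinel: no more batches

def training_worker(store, consumed):
    """Drains the store and consumes each batch independently."""
    while True:
        batch = store.get()
        if batch is None:
            break
        consumed.append(batch["step"])

store = HiddenStateStore()
consumed = []
t = threading.Thread(target=training_worker, args=(store, consumed))
t.start()
inference_worker(store, num_batches=5)
t.join()
print(consumed)  # → [0, 1, 2, 3, 4]
```

Because the two sides only meet at the store, each can be scaled or restarted independently, which is the property the library's disaggregated design is after.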

No package · No dependents
Maintenance 13 / 25
Adoption 7 / 25
Maturity 11 / 25
Community 9 / 25


Stars: 32
Forks: 3
Language: Python
License: MIT
Last pushed: Mar 11, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/torchspec-project/TorchSpec"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
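The same endpoint can be called from Python. A small sketch, assuming only the URL pattern shown in the curl example above; the response schema is not documented here, so the code just inspects whatever JSON comes back and degrades gracefully when offline or rate-limited.

```python
import json
import urllib.request

# URL pattern taken from the curl example above; the "transformers"
# ecosystem segment and owner/repo path are as shown there.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem, owner, repo):
    """Build the quality-endpoint URL for a given repo."""
    return f"{BASE}/{ecosystem}/{owner}/{repo}"

url = quality_url("transformers", "torchspec-project", "TorchSpec")
print(url)

try:
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)       # schema undocumented here; inspect it
        print(sorted(data)[:5])      # show a few top-level keys
except OSError as exc:
    print(f"request failed: {exc}")  # e.g. offline or rate-limited
```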