rookiemann/multi-turboquant
Unified KV cache compression for LLM inference: TurboQuant, IsoQuant, PlanarQuant, TriAttention. 10 methods, GPU-validated, with a multi-GPU planner. Compress the KV cache 5-80x to run bigger models, longer contexts, and more agents on your GPU.
Stars: 1
Forks: n/a
Language: Python
License: MIT
Last pushed: Apr 10, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/rookiemann/multi-turboquant"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
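The curl call above can also be made from Python. The following is a minimal sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page. The URL layout (category, then owner/repo) is inferred from the single example URL shown above.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo, e.g. category 'ml-frameworks'.

    The <category>/<owner>/<repo> layout is inferred from one example URL.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Send an unauthenticated GET request and decode the response.

    Assumption: the response body is JSON. Adjust the decoding if the
    service returns a different format.
    """
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("ml-frameworks", "rookiemann/multi-turboquant")
```

Keeping URL construction separate from the network call makes it easy to check the request target without spending any of the 100 keyless requests per day.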
Higher-rated alternatives
scikit-learn/scikit-learn
scikit-learn: machine learning in Python
pallets/click
Python composable command line interface toolkit
Farama-Foundation/Gymnasium
An API standard for single-agent reinforcement learning environments, with popular reference...
probabl-ai/skore
Track your Data Science. Skore's open-source Python library accelerates ML model development...
huggingface/evaluate
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.