Tencent/AngelSlim

Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.

Score: 85 / 100 (Verified)

Supports multiple compression strategies across LLMs, vision-language models, and diffusion models: quantization algorithms (FP8, INT4, INT8, and less common formats such as NVFP4 and 1.25-bit Sherry), speculative decoding frameworks (Eagle3, SpecExit), and pruning. Built on a unified post-training quantization (PTQ) pipeline optimized for single-GPU operation on models up to 235B parameters. Integrates with the Hugging Face and ModelScope ecosystems, with inference backends including vLLM and Torch.
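For readers new to post-training quantization, the following is a minimal, generic sketch of symmetric per-tensor INT8 quantization. It is not AngelSlim's API or its actual algorithm; the function names `quantize_int8` and `dequantize_int8` are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor INT8 quantization (illustrative, not AngelSlim code).

    Maps floats to integers in [-127, 127] using a single scale factor.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Fall back to a scale of 1.0 for an all-zero (or empty) tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer, then clip to the representable range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]
```

Each dequantized value differs from the original by at most about half the scale, which is the rounding error that calibration-based PTQ methods aim to keep small.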

536 stars and 5,117 monthly downloads. Actively maintained with 21 commits in the last 30 days. Available on PyPI.

Maintenance 23 / 25
Adoption 19 / 25
Maturity 24 / 25
Community 19 / 25


Stars: 536
Forks: 68
Language: Python
License:
Last pushed: Mar 12, 2026
Monthly downloads: 5,117
Commits (30d): 21
Dependencies: 13

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Tencent/AngelSlim"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
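The same request can be sketched in Python using only the standard library. This assumes the endpoint returns JSON, which the page does not state; the helper names `build_quality_url` and `fetch_quality` are hypothetical, and no response fields are assumed.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL, percent-encoding each path segment."""
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> Any:
    """Fetch and decode the response (assumed here to be JSON)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request):
# data = fetch_quality("llm-tools", "Tencent", "AngelSlim")
```

With the keyless limit of 100 requests/day, a caller polling many projects would likely want to cache responses locally rather than refetch on every run.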