mit-han-lab/TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library

Score: 45 / 100 (Emerging)

Supports both LLMs and vision-language models (VLMs) through AWQ and SmoothQuant quantization techniques, enabling 4-bit inference with minimal accuracy loss. Built as zero-dependency C/C++ for cross-platform compatibility—x86, ARM (M1/M2, Raspberry Pi), and CUDA GPUs—with optimized kernels for each architecture. A pre-quantized model zoo on Hugging Face allows immediate deployment without running the compression pipeline locally.
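To make the 4-bit idea concrete, here is a minimal, illustrative sketch of per-group symmetric INT4 weight quantization in plain Python. This is not TinyChatEngine's kernel or the AWQ algorithm itself (AWQ additionally rescales salient channels using activation statistics before quantizing); the group size and clipping range are assumptions chosen for demonstration.

```python
def quantize_4bit(weights, group_size=4):
    """Quantize floats to INT4 codes with one scale per group.
    Illustrative only; not TinyChatEngine's actual implementation."""
    q, scales = [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        # Map the group's max magnitude onto the symmetric INT4 range [-7, 7].
        scale = max(abs(x) for x in group) / 7.0 or 1.0
        scales.append(scale)
        q.append([max(-8, min(7, round(x / scale))) for x in group])
    return q, scales

def dequantize(q, scales):
    """Recover approximate floats from INT4 codes and per-group scales."""
    return [v * s for row, s in zip(q, scales) for v in row]

w = [0.12, -0.5, 0.33, 0.07, 1.2, -0.9, 0.0, 0.45]
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
# Each reconstructed weight is within half a quantization step of the original.
```

Per-group scales are what keep the accuracy loss small: a single outlier weight only inflates the quantization step for its own group, not for the whole tensor.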

944 stars. No commits in the last 6 months.

Flags: Stale (6 months), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 19 / 25


Stars: 944
Forks: 95
Language: C++
License: MIT
Last pushed: Jul 04, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/mit-han-lab/TinyChatEngine"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
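For use from a script, the endpoint URL from the curl example above can be built programmatically. The sketch below only constructs the URL; the response schema is not documented here, so actual fetching code should parse the JSON defensively. The function name and parameters are hypothetical helpers, not part of any published client.

```python
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-score endpoint URL for a repository.
    Path layout follows the curl example: /{category}/{owner}/{repo}."""
    return f"{API_BASE}/{category}/{owner}/{repo}"

url = quality_url("llm-tools", "mit-han-lab", "TinyChatEngine")
```

At 100 unkeyed requests per day, a batch job over many repositories should either cache responses or use a free key for the 1,000/day limit.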