yhinsson/airllm
🚀 Optimizes memory for large language models, enabling 70B models on a 4GB GPU and Llama 3.1 405B on 8GB of VRAM, without compression techniques.
Quality score: 13 / 100
Experimental · No License · No Package · No Dependents
Maintenance: 10 / 25
Adoption: 2 / 25
Maturity: 1 / 25
Community: 0 / 25
Stars: 2
Forks: —
Language: —
License: —
Category: —
Last pushed: Feb 03, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/yhinsson/airllm"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
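For scripted access, here is a minimal Python sketch of the same request, using only the standard library. The response structure is not documented on this page, so the script prints the raw JSON rather than assuming field names; how an API key is supplied is likewise not shown here.

# Minimal sketch: fetch the quality report shown on this page via the open API.
# The response schema is undocumented here, so we print the raw JSON and let
# the caller inspect it instead of guessing key names.
import json
from urllib.request import urlopen

repo = "transformers/yhinsson/airllm"  # ecosystem/owner/name, as in the curl example
url = f"https://pt-edge.onrender.com/api/v1/quality/{repo}"

with urlopen(url, timeout=10) as resp:  # anonymous access: 100 requests/day
    report = json.load(resp)

print(json.dumps(report, indent=2, ensure_ascii=False))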
Higher-rated alternatives
lyogavin/airllm (score 83): AirLLM 70B inference with single 4GB GPU
shibing624/MedicalGPT (score 65): MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline....
GradientHQ/parallax (score 60): Parallax is a distributed model serving framework that lets you build your own AI cluster anywhere
CrazyBoyM/llama3-Chinese-chat (score 55): Chinese post-training repository for Llama3 and Llama3.1: fine-tuned and hacked variants with interesting weights, plus tutorial videos and documentation for training, inference, evaluation, and deployment.
MediaBrain-SJTU/MING (score 49): 明医 (MING), a large language model for Chinese medical consultation