airllm and Awesome-Chinese-LLM
A high-efficiency LLM inference engine is a complement to a curated list of deployable Chinese LLMs, as the former provides the computational infrastructure to run models identified by the latter.
About airllm
lyogavin/airllm
AirLLM 70B inference with single 4GB GPU
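AirLLM fits 70B-parameter inference into 4GB of GPU memory by keeping only one transformer layer resident at a time, streaming layer weights from disk as the forward pass proceeds. Below is a minimal toy sketch of that layer-streaming idea, not AirLLM's actual API: the "layers" are stand-in scalar weights and the file layout is invented for illustration.

```python
import os
import pickle
import tempfile

def save_layers(layers, folder):
    # Persist each layer's weights as its own file so that only one
    # layer ever needs to be resident in memory during inference.
    paths = []
    for i, weights in enumerate(layers):
        path = os.path.join(folder, f"layer_{i}.pkl")
        with open(path, "wb") as f:
            pickle.dump(weights, f)
        paths.append(path)
    return paths

def layered_forward(x, paths):
    # Stream layers from disk one at a time (the core trick behind
    # AirLLM), so peak memory is one layer rather than the full model.
    for path in paths:
        with open(path, "rb") as f:
            weights = pickle.load(f)   # load a single layer
        x = [xi * weights for xi in x]  # toy "layer": scalar scaling
        del weights                     # release it before the next layer
    return x

folder = tempfile.mkdtemp()
paths = save_layers([2.0, 3.0], folder)
print(layered_forward([1.0, 4.0], paths))  # → [6.0, 24.0]
```

The trade-off this sketch hides is throughput: repeatedly reading weights from disk makes each token far slower than keeping the whole model in VRAM, which is the price of running a 70B model on a 4GB card.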
About Awesome-Chinese-LLM
HqWu-HITCS/Awesome-Chinese-LLM
A curated collection of open-source Chinese large language models, focusing on smaller-scale models that can be privately deployed and trained at low cost, covering base models, domain-specific fine-tuned variants and applications, datasets, and tutorials.
Provides a curated registry of 100+ open-source Chinese LLM resources including base models (ChatGLM, Qwen, LLaMA, Baichuan), domain-specific fine-tuned variants (medical, legal, financial), and training/inference frameworks. Organizes models by parameter scale and commercial usability, with comparative tables detailing training tokens, context lengths, and architecture choices like Multi-Query Attention for efficiency. Covers the full LLM lifecycle: datasets, supervised fine-tuning, preference alignment, evaluation benchmarks, deployment frameworks (vLLM, llama.cpp), and applied tutorials for LangChain integration and agent implementation.